Abstract

Several researchers, including Leonid Levin, Gerard 't Hooft, and Stephen Wolfram, have 
argued that quantum mechanics will break down before the factoring of large numbers becomes 
possible. If this is true, then there should be a natural set of quantum states that can account for 
all quantum computing experiments performed to date, but not for Shor's factoring algorithm. 
We investigate as a candidate the set of states expressible by a polynomial number of additions 
and tensor products. Using a recent lower bound on multilinear formula size due to Raz, we 
then show that states arising in quantum error-correction require $n^{\Omega(\log n)}$ additions and tensor
products even to approximate, which incidentally yields the first superpolynomial gap between 
general and multilinear formula size of functions. More broadly, we introduce a complexity 
classification of pure quantum states, and prove many basic facts about this classification. Our 
goal is to refine vague ideas about a breakdown of quantum mechanics into specific hypotheses 
that might be experimentally testable in the near future. 

1 Introduction 

QC of the sort that factors long numbers seems firmly rooted in science fiction . . . The 
present attitude would be analogous to, say, Maxwell selling the Daemon of his famous 
thought experiment as a path to cheaper electricity from heat. -- Leonid Levin [35]

Quantum computing presents a dilemma: is it reasonable to study a type of computer that has 
never been built, and might never be built in one's lifetime? Some researchers strongly believe the 
answer is 'no.' Their objections generally fall into four categories: 

(A) There is a fundamental physical reason why large quantum computers can never be built. 

(B) Even if (A) fails, large quantum computers will never be built in practice. 

(C) Even if (A) and (B) fail, the speedup offered by quantum computers is of limited theoretical 
interest. 

(D) Even if (A), (B), and (C) fail, the speedup is of limited practical value. 1 
Institute (Waterloo, Canada). Supported by an NSF Graduate Fellowship and by the Defense Advanced Research
1 Because of the 'even if' clauses, the objections seem to us logically independent, so that there are 16 possible
positions regarding them (or 15 if one is against quantum computing). We ignore the possibility that no speedup 
exists, in other words that BPP = BQP. By 'large quantum computer' we mean any computer much faster than its 
best classical simulation, as a result of asymptotic complexity rather than the speed of elementary operations. Such 
a computer need not be universal; it might be specialized for (say) factoring. 
The objections can be classified along two axes:

              Theoretical   Practical
Physical      (A)           (B)
Algorithmic   (C)           (D)

This paper focuses on objection (A). Its goal is not to win a debate about this objection, but 
to lay the groundwork for a rigorous discussion, and thus hopefully lead to new science. Section
2 provides the philosophical motivation for our paper, by examining the arguments of several
quantum computing skeptics, including Leonid Levin, Gerard 't Hooft, and Stephen Wolfram. It 
concludes that a key weakness of their arguments is their failure to answer the following question: 
Exactly what property separates the quantum states we are sure we can create, from those that 
suffice for Shor's factoring algorithm? We call such a property a Sure/Shor separator. Section 3
develops a complexity theory of pure quantum states that studies possible Sure/Shor separators.
In particular, it introduces tree states, which informally are those states $|\psi\rangle \in \mathcal{H}_2^{\otimes n}$ expressible by
a polynomial-size 'tree' of addition and tensor product gates. For example, $\alpha|0\rangle^{\otimes n} + \beta|1\rangle^{\otimes n}$ and
$(\alpha|0\rangle + \beta|1\rangle)^{\otimes n}$ are both tree states. Section 4 investigates basic properties of this class of states.
Among other results, it shows that any tree state is representable by a tree of polynomial size and 
logarithmic depth; and that most states do not even have large inner product with any tree state. 

Our main results, proved in Section 5, are lower bounds on tree size for various natural families
of quantum states. In particular, Section 5.1 analyzes "subgroup states," which are uniform
superpositions $|S\rangle$ over all elements of a subgroup $S \le \mathbb{Z}_2^n$. The importance of these states arises
from their central role in stabilizer codes, a type of quantum error-correcting code. We first show
that if $S$ is chosen uniformly at random, then with high probability $|S\rangle$ cannot be represented by
any tree of size $n^{o(\log n)}$. This result has a corollary of independent complexity-theoretic interest:
the first superpolynomial gap between the formula size and the multilinear formula size of a function
$f : \{0,1\}^n \to \mathbb{R}$. We then present two improvements of our basic lower bound. First, we show that
a random subgroup state cannot even be approximated well in trace distance by any tree of size
$n^{o(\log n)}$. Second, we "derandomize" the lower bound, by using Reed-Solomon codes to construct
an explicit subgroup state with tree size $n^{\Omega(\log n)}$.
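As a small illustration of the subgroup states just described, the following Python sketch enumerates a subgroup of $\mathbb{Z}_2^n$ from a list of generators and builds the amplitude vector of the uniform superposition over it. The function names `span_gf2` and `subgroup_state`, and the encoding of group elements as integer bit masks, are assumptions made for this example; they do not come from the paper.

```python
from math import sqrt

def span_gf2(generators):
    """All elements of the subgroup of Z_2^n generated by the given bit masks."""
    elements = {0}
    for g in generators:
        # Adding g (bitwise XOR) to every element found so far closes the set under g.
        elements |= {e ^ g for e in elements}
    return elements

def subgroup_state(generators, n):
    """Amplitudes of the uniform superposition |S> as a list of length 2**n."""
    s = span_gf2(generators)
    amp = 1 / sqrt(len(s))
    return [amp if x in s else 0.0 for x in range(2 ** n)]

# Example: the subgroup of Z_2^3 generated by 011 and 110 (the even-parity strings).
state = subgroup_state([0b011, 0b110], 3)
```

The vector has $2^n$ entries, so this brute-force representation is only usable for very small $n$; the point of the tree-size lower bounds is precisely that no succinct tree representation exists for typical subgroups.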

Section 5.2 analyzes the states that arise in Shor's factoring algorithm, for example a uniform
superposition over all multiples of a fixed positive integer p, written in binary. Originally, we had 
hoped to show a superpolynomial tree size lower bound for these states as well. However, we are 
only able to show such a bound assuming a number-theoretic conjecture. 

Our lower bounds use a sophisticated recent technique of Raz [41, 42], which was introduced
to show that the permanent and determinant of a matrix require superpolynomial-size multilinear
formulas. Currently, Raz's technique is only able to show lower bounds of the form $n^{\Omega(\log n)}$, but
we conjecture that $2^{\Omega(n)}$ lower bounds hold in all of the cases discussed above.

One might wonder how tree size relates to more physical properties of quantum states, such as 
their robustness to decoherence. Section 5.3 addresses this question. In particular, it shows that
if $|S\rangle$ is a superposition over codewords of any sufficiently good erasure code, then $|S\rangle$ has tree size
$n^{\Omega(\log n)}$, although not vice versa. It also argues that Raz's lower bound technique is connected to
a notion called "persistence of entanglement," but gives examples showing that the connection is 
not exact. 

Section 6 addresses the following question. If the state of a quantum computer at every time
step is a tree state, then can the computer be simulated classically? In other words, letting
TreeBQP be the class of languages accepted by such a machine, does TreeBQP = BPP? A positive
answer would make tree states more attractive as a Sure/Shor separator. For once we admit any
states incompatible with the polynomial-time Church-Turing thesis, it seems like we might as well
go all the way, and admit all states preparable by polynomial-size quantum circuits! Although we
leave this question open, we do show that TreeBQP $\subseteq \Sigma_3^P \cap \Pi_3^P$, where $\Sigma_3^P \cap \Pi_3^P$ is the third level of
the polynomial hierarchy PH. By contrast, it is conjectured that BQP $\not\subset$ PH, though admittedly
not on strong evidence.

Section 7 discusses the implications of our results for experimental physics. It advocates a
dialectic between theory and experiment, in which theorists would propose a class of quantum states
that encompasses everything seen so far, and then experimenters would try to prepare states not in
that class. It also asks whether states with superpolynomial tree size have already been observed
in condensed-matter systems; and more broadly, what sort of evidence is needed to establish a
state's existence. Other issues addressed in Section 7 include how to deal with mixed states and
particle position and momentum states, and the experimental relevance of asymptotic bounds. 

Finally, two appendices investigate quantum state complexity measures other than tree size. 
Appendix A shows relationships among tree size, circuit size, bounded-depth tree size, Vidal's $\chi$
complexity [46], and several other measures. It also relates questions about quantum state classes
to more traditional questions about computational complexity classes. Appendix B studies a
weakening of tree size called "manifestly orthogonal tree size," and shows that this measure can
sometimes be characterized exactly, enabling us to prove exponential lower bounds. Our techniques
in Appendix B might be of independent interest to complexity theorists.

We conclude in Section 8 with some open problems.

2 How Quantum Mechanics Could Fail 

This section discusses objection (A), that quantum computing is impossible for a fundamental 
physical reason. Among computer scientists, this objection is most closely associated with Leonid 
Levin [35]. 2 The following passage captures much of the flavor of his critique:

The major problem [with quantum computing] is the requirement that basic quantum 
equations hold to multi-hundredth if not millionth decimal positions where the signifi- 
cant digits of the relevant quantum amplitudes reside. We have never seen a physical 
law valid to over a dozen decimals. Typically, every few new decimal places require 
major rethinking of most basic concepts. Are quantum amplitudes still complex num- 
bers to such accuracies or do they become quaternions, colored graphs, or sick-humored 
gremlins? [35]

Among other things, Levin argues that quantum computing is analogous to the unit-cost arith- 
metic model, and should be rejected for essentially the same reasons; that claims to the contrary 
rest on a confusion between metric and topological approximation; that quantum fault-tolerance 
theorems depend on extravagant assumptions; and that even if a quantum computer failed, we could 
not measure its state to prove a breakdown of quantum mechanics, and thus would be unlikely to 
learn anything new. 

A few responses to Levin's arguments can be offered immediately. First, even classically, one 
can flip a coin a thousand times to produce probabilities of order $2^{-1000}$. Should one dismiss such
probabilities as unphysical? At the very least, it is not obvious that amplitudes should behave 

2 Since this paper was written, Oded Goldreich 2E>] has also put forward an argument against quantum computing. 
Compared to Levin's arguments, Goldreich's is easily understood: he believes that Shor states have exponential "non- 
degeneracy" and therefore take exponential time to prepare, and that there is no burden on those who hold this view 
to suggest a definition of non-degeneracy. 
differently than probabilities with respect to error, since both evolve linearly, and neither is directly
observable. 

Second, if Levin believes that quantum mechanics will fail, but is agnostic about what will 
replace it, then his argument can be turned around. How do we know that the successor to 
quantum mechanics will limit us to BPP, rather than letting us solve (say) PSPACE-complete 
problems? This is more than a logical point. Abrams and Lloyd [3] argue that a wide class of 
nonlinear variants of the Schrödinger equation would allow NP-complete and even #P-complete
problems to be solved in polynomial time. And Penrose [39], who proposed a model for 'objective
collapse' of the wavefunction, believes that his proposal takes us outside the set of computable 
functions entirely! 

Third, to falsify quantum mechanics, it would suffice to show that a quantum computer evolved 
to some state far from the state that quantum mechanics predicts. Measuring the exact state is 
unnecessary. Nobel prizes have been awarded in the past 'merely' for falsifying a previously held 
theory, rather than replacing it by a new one. An example is the physics Nobel awarded to Fitch 
[19] and Cronin ^Tj\ in 1980 for discovering CP symmetry violation.

Perhaps the key to understanding Levin's unease about quantum computing lies in his remark 
that "we have never seen a physical law valid to over a dozen decimals." Here he touches on a 
serious epistemological question: How far should we extrapolate from today's experiments to where 
quantum mechanics has never been tested? We will try to address this question by reviewing the 
evidence for quantum mechanics. For our purposes it will not suffice to declare the predictions of 
quantum mechanics "verified to one part in a trillion," because we need to distinguish at least three 
different types of prediction: interference, entanglement, and Schrödinger cats. Let us consider
these in turn. 

(1) Interference. If the different paths that an electron could take in its orbit around a 
nucleus did not interfere destructively, canceling each other out, then electrons would not 
have quantized energy levels. So being accelerating electric charges, they would lose energy 
and spiral into their respective nuclei, and all matter would disintegrate. That this has not 
happened — together with the results of (for example) single-photon double-slit experiments — 
is compelling evidence for the reality of quantum interference. 

(2) Entanglement. One might accept that a single particle's position is described by a wave 
in three-dimensional phase space, but deny that two particles are described by a wave in
six-dimensional phase space. However, the Bell inequality experiments of Aspect et al. [8] and
successors have convinced all but a few physicists that quantum entanglement exists, can be 
maintained over large distances, and cannot be explained by local hidden-variable theories. 

(3) Schrödinger Cats. Accepting two- and three-particle entanglement is not the same as
accepting that whole molecules, cats, humans, and galaxies can be in coherent superposition
states. However, recently Arndt et al. [7] have performed the double-slit interference experiment
using $C_{60}$ molecules (buckyballs) instead of photons; while Friedman et al. [20] have
found evidence that a superconducting current, consisting of billions of electrons, can enter 
a coherent superposition of flowing clockwise around a coil and flowing counterclockwise (see 
Leggett [31] for a survey of such experiments). Though short of cats, these experiments at 
least allow us to say the following: if we could build a general-purpose quantum computer with 
as many components as have already been placed into coherent superposition, then on certain 
problems, that computer would outperform any computer in the world today. 

Having reviewed some of the evidence for quantum mechanics, we must now ask what alter- 
natives have been proposed that might also explain the evidence. The simplest alternatives are 
those in which quantum states "spontaneously collapse" with some probability, as in the GRW
(Ghirardi-Rimini-Weber) theory [23]. 3 The drawbacks of the GRW theory include violations of
energy conservation, and parameters that must be fine-tuned to avoid conflicting with experiments.
More relevant for us, though, is that the collapses postulated by the theory are only in the
position basis, so that quantum information stored in internal degrees of freedom (such as spin) is
unaffected. Furthermore, even if we extended the theory to collapse those internal degrees, large
quantum computers could still be built. For the theory predicts roughly one collapse per particle
per $10^{15}$ seconds, with a collapse affecting everything in a $10^{-7}$-meter vicinity. So even in such a
vicinity, one could perform a computation involving (say) $10^{10}$ particles for $10^5$ seconds. Finally, as
pointed out to us by Rob Spekkens, standard quantum error-correction techniques might be used
to overcome even GRW-type decoherence.
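The arithmetic behind the GRW estimate can be checked in a few lines. The sketch below assumes, purely for illustration, that collapses are independent events occurring at a fixed rate per particle; the constant and function names are hypothetical and chosen for this example.

```python
# Back-of-the-envelope check of the GRW figures quoted in the text.
# Assumption: collapses are independent, at a fixed rate per particle.
COLLAPSE_INTERVAL_S = 1e15   # roughly one collapse per particle per 10^15 seconds
N_PARTICLES = 1e10           # particles taking part in the computation
RUN_TIME_S = 1e5             # duration of the computation, in seconds

def expected_collapses(n_particles, run_time_s, interval_s=COLLAPSE_INTERVAL_S):
    """Expected number of spontaneous collapses over the whole run."""
    return n_particles * run_time_s / interval_s

print(expected_collapses(N_PARTICLES, RUN_TIME_S))
```

Under these assumptions the expected number of collapses over the whole run is about one, which is why a computation of this size and duration sits at the edge of what the quoted collapse rate would permit.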

A second class of alternatives includes those of 't Hooft jHU] and Wolfram [48], in which something
like a deterministic cellular automaton underlies quantum mechanics. On the basis of his
theory, 't Hooft predicts that "[i]t will never be possible to construct a 'quantum computer' that
can factor a large number faster, and within a smaller region of space, than a classical machine
would do, if the latter could be built out of parts at least as large and as slow as the Planckian
dimensions" |3L)| . Similarly, Wolfram states that "[i]ndeed within the usual formalism [of quantum
mechanics] one can construct quantum computers that may be able to solve at least a few specific
problems exponentially faster than ordinary Turing machines. But particularly after my discoveries
... I strongly suspect that even if this is formally the case, it will still not turn out to be
a true representation of ultimate physical reality, but will instead just be found to reflect various
idealizations made in the models used so far" [48, p. 771].

The obvious question then is how these theories account for Bell inequality violations. We 
confess to being unable to understand 't Hooft's answer to this question, except that he believes 
that the usual notions of causality and locality might no longer apply in quantum gravity. As for 
Wolfram's theory, which involves "long-range threads" to account for Bell inequality violations, we 
argued in [Q that it fails Wolfram's own desiderata of causal and relativistic invariance. 

So the challenge for quantum computing skeptics is clear. Ideally, come up with an alternative 
to quantum mechanics — even an idealized toy theory — that can account for all present-day exper- 
iments, yet would not allow large-scale quantum computation. Failing that, at least say what you 
take quantum mechanics' domain of validity to be. One way to do this would be to propose a set 
S of quantum states that you believe corresponds to possible physical states of affairs. 4 The set 
S must contain all "Sure states" (informally, the states that have already been demonstrated in 
the lab), but no "Shor states" (again informally, the states that can be shown to suffice for factoring,
say, 500-digit numbers). If S satisfies both of these constraints, then we call S a Sure/Shor
separator (see Figure 1). 

Of course, an alternative theory need not involve a sharp cutoff between possible and impossible 
states. So it is perfectly acceptable for a skeptic to define a "complexity measure" $C$ for quantum
states, and then say something like the following: If $|\psi_n\rangle$ is a state of $n$ spins, and $C(|\psi_n\rangle)$ is
at most, say, $n^2$, then I predict that $|\psi_n\rangle$ can be prepared using only "polynomial effort." Also, once
prepared, $|\psi_n\rangle$ will be governed by standard quantum mechanics to extremely high precision. All
states created to date have had small values of $C(|\psi_n\rangle)$. However, if $C(|\psi_n\rangle)$ grows as, say, $2^n$,
then I predict that $|\psi_n\rangle$ requires "exponential effort" to prepare, or else is not even approximately
governed by quantum mechanics, or else does not even make sense in the context of an alternative

3 Penrose [39] has proposed another such theory, but as mentioned earlier, his theory suggests that the quantum
computing model is too restrictive.

4 A skeptic might also specify what happens if a state $|\psi\rangle \in S$ is acted on by a unitary $U$ such that $U|\psi\rangle \notin S$, but
this will not be insisted upon.

Figure 1: A Sure/Shor separator must contain all Sure states but no Shor states. That is why 
neither local hidden variables nor the GRW theory yields a Sure/Shor separator. 

theory. The states that arise in Shor's factoring algorithm have exponential values of $C(|\psi_n\rangle)$.
So as my Sure/Shor separator, I propose the set of all infinite families of states $\{|\psi_n\rangle\}_{n \ge 1}$, where
$|\psi_n\rangle$ has $n$ qubits, such that $C(|\psi_n\rangle) \le p(n)$ for some polynomial $p$.

To understand the importance of Sure/Shor separators, it is helpful to think through some 
examples. A major theme of Levin's arguments was that exponentially small amplitudes are 
somehow unphysical. However, clearly we cannot reject all states with tiny amplitudes, for would
anyone dispute that the state $2^{-5000}(|0\rangle + |1\rangle)^{\otimes 10000}$ is formed whenever 10,000 photons are each
polarized at 45°? Indeed, once we accept $|\psi\rangle$ and $|\varphi\rangle$ as Sure states, we are almost forced to accept
$|\psi\rangle \otimes |\varphi\rangle$ as well, since we can imagine, if we like, that $|\psi\rangle$ and $|\varphi\rangle$ are prepared in two separate
laboratories. 5 So considering a Shor state such as

$$|\Phi\rangle = \frac{1}{2^{n/2}} \sum_{r=0}^{2^n-1} |r\rangle\,|x^r \bmod N\rangle,$$

what property of this state could quantum computing skeptics latch onto as being physically
extravagant? They might complain that $|\Phi\rangle$ involves entanglement across hundreds or thousands
of particles; but as mentioned earlier, there are other states with that same property, namely the
"Schrödinger cats" $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$, that should be regarded as Sure states. Alternatively,
the skeptics might object to the combination of exponentially small amplitudes with entanglement
across hundreds of particles. However, simply viewing a Schrödinger cat state in the Hadamard
basis produces an equal superposition over all strings of even parity, which has both properties.
We seem to be on a slippery slope leading to all of quantum mechanics! Is there any defensible 
place to draw a line? 
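The claim about the Hadamard basis can be checked numerically for small $n$. The following Python sketch applies a Hadamard gate to every qubit of an $n$-qubit cat state and inspects the resulting amplitudes. The helper names `cat_state`, `hadamard_all`, and `parity`, and the choice $n = 6$, are assumptions made for this illustration.

```python
from math import sqrt, isclose

def cat_state(n):
    """Amplitudes of (|0...0> + |1...1>)/sqrt(2) as a list of length 2**n."""
    amps = [0.0] * (2 ** n)
    amps[0] = amps[-1] = 1 / sqrt(2)
    return amps

def hadamard_all(amps, n):
    """Apply a Hadamard gate to each of the n qubits in turn."""
    amps = list(amps)
    for q in range(n):
        bit = 1 << q
        for x in range(2 ** n):
            if not x & bit:                      # visit each pair (x, x | bit) once
                a, b = amps[x], amps[x | bit]
                amps[x] = (a + b) / sqrt(2)
                amps[x | bit] = (a - b) / sqrt(2)
    return amps

def parity(x):
    """Parity (0 or 1) of the number of 1 bits in x."""
    return bin(x).count("1") % 2

n = 6
out = hadamard_all(cat_state(n), n)
```

Every even-parity string ends up with amplitude $2^{-(n-1)/2}$ and every odd-parity string with amplitude 0, so the amplitudes shrink exponentially in $n$ while the state remains entangled across all $n$ qubits.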

The dilemma above is what led us to propose tree states as a possible Sure/Shor separator. 
The idea, which might seem more natural to logicians than to physicists, is this. Once we accept 
the linear combination and tensor product rules of quantum mechanics (allowing $\alpha|\psi\rangle + \beta|\varphi\rangle$ and
$|\psi\rangle \otimes |\varphi\rangle$ into our set $S$ of possible states whenever $|\psi\rangle, |\varphi\rangle \in S$), one of our few remaining hopes

5 A reviewer comments that in Chern-Simons theory (for example), there is no clear tensor product decomposition. 
However, the only question that concerns us is whether $|\psi\rangle \otimes |\varphi\rangle$ is a Sure state, given that $|\psi\rangle$ and $|\varphi\rangle$ are both Sure
states that are well-described in tensor product Hilbert spaces. 







Figure 2: Expressing $(|00\rangle + |01\rangle + |10\rangle - |11\rangle)/2$ by a tree of linear combination and tensor product
gates, with scalar multiplication along edges. Subscripts denote the identity of a qubit. (Leaf labels in the figure: $|0\rangle_1$, $|1\rangle_1$, $|0\rangle_2$, $|1\rangle_2$.)

for keeping S a proper subset of the set of all states is to impose some restriction on how those two 
rules can be iteratively applied. In particular, we could let $S$ be the closure of $\{|0\rangle, |1\rangle\}$ under a
polynomial number of linear combinations and tensor products. That is, $S$ is the set of all infinite
families of states $\{|\psi_n\rangle\}_{n \ge 1}$ with $|\psi_n\rangle \in \mathcal{H}_2^{\otimes n}$, such that $|\psi_n\rangle$ can be expressed as a "tree" involving
at most $p(n)$ addition, tensor product, $|0\rangle$, and $|1\rangle$ gates for some polynomial $p$ (see Figure 2).

To be clear, we are not advocating that "all states in Nature are tree states" as a serious physical 
hypothesis. Indeed, even if we believed firmly in a breakdown of quantum mechanics, 6 there are 
other choices for the set S that seem equally reasonable. For example, define orthogonal tree 
states similarly to tree states, except that we can only form the linear combination $\alpha|\psi\rangle + \beta|\varphi\rangle$ if
$\langle\psi|\varphi\rangle = 0$. Rather than choose among tree states, orthogonal tree states, and the other candidate
Sure/Shor separators that occurred to us, our approach will be to prove everything we can about 
all of them. If we devote more space to tree states than to others, that is simply because tree 
states are the subject of our most interesting results. On the other hand, if we show (for example) 
that $\{|\psi_n\rangle\}$ is not a tree state, then we have also shown that $\{|\psi_n\rangle\}$ is not an orthogonal tree state.
So many candidate separators are related to each other; and indeed, their relationships will be a 
major theme of the paper. 

Let us summarize. To debate whether quantum computing is fundamentally impossible, we 
need at least one proposal for how it could be impossible. Since even skeptics admit that quantum 
mechanics is valid within some "regime," a key challenge for any such proposal is to separate 
the regime of acknowledged validity from the quantum computing regime. Though others will 
disagree, we do not see any choice but to identify those two regimes with classes of quantum states. 
For gates and measurements that suffice for quantum computing have already been demonstrated 
experimentally. Thus, if we tried to identify the two regimes with classes of gates or measurements, 
then we could equally well talk about the class of states on which all 1- and 2-qubit operations 
behave as expected. A similar argument would apply if we identified the two regimes with classes 
of quantum circuits — since any "memory" that a quantum system retains of the previous gates in a 
circuit, is part of the system's state by definition. So: states, gates, measurements, circuits — what 
else is there? 

We should stress that none of the above depends on the interpretation of quantum mechanics. In 
particular, it is irrelevant whether we regard quantum states as "really out there" or as representing 
subjective knowledge — since in either case, the question is whether there can exist systems that we 
would describe by $|\psi\rangle$ based on their observed behavior.

6 which we don't 
Once we agree to seek a Sure/Shor separator, we quickly find that the obvious ideas (based
on precision in amplitudes, or entanglement across hundreds of particles) are nonstarters. The
only idea that we have found plausible is to limit the class of allowed quantum states to those with
some kind of succinct representation. That still leaves numerous possibilities; and for each one,
it might be a difficult problem to decide whether a given $|\psi\rangle$ is succinctly representable or not.
Thus, constructing a useful theory of Sure/Shor separators will not be easy. But we should start 
somewhere. 

3 Classifying Quantum States 

In both quantum and classical complexity theory, the objects studied are usually sets of languages or 
Boolean functions. However, a generic n-qubit quantum state requires exponentially many classical 
bits to describe, and this suggests looking at the complexity of quantum states themselves. That is, 
which states have polynomial-size classical descriptions of various kinds? This question has been 
studied from several angles by Aharonov and Ta-Shma [3]; Janzing, Wocjan, and Beth; Vidal
[46]; and Green et al. [2HJ- Here we propose a general framework for the question. For simplicity,
we limit ourselves to pure states $|\psi_n\rangle \in \mathcal{H}_2^{\otimes n}$ with the fixed orthogonal basis $\{|x\rangle : x \in \{0,1\}^n\}$.
Also, by 'states' we mean infinite families of states $\{|\psi_n\rangle\}_{n \ge 1}$.

Like complexity classes, pure quantum states can be organized into a hierarchy (see Figure 3). 
At the bottom are the classical basis states, which have the form $|x\rangle$ for some $x \in \{0,1\}^n$. We
can generalize classical states in two directions: to the class $\otimes_1$ of separable states, which have
the form $(\alpha_1|0\rangle + \beta_1|1\rangle) \otimes \cdots \otimes (\alpha_n|0\rangle + \beta_n|1\rangle)$; and to the class $\Sigma_1$, which consists of all states
$|\psi_n\rangle$ that are superpositions of at most $p(n)$ classical states, where $p$ is a polynomial. At the
next level, $\otimes_2$ contains the states that can be written as a tensor product of $\Sigma_1$ states, with qubits
permuted arbitrarily. Likewise, $\Sigma_2$ contains the states that can be written as a linear combination
of a polynomial number of $\otimes_1$ states. We can continue indefinitely to $\Sigma_3$, $\otimes_3$, etc. Containing
the whole 'tensor-sum hierarchy' $\bigcup_k \Sigma_k = \bigcup_k \otimes_k$ is the class Tree, of all states expressible by a
polynomial-size tree of additions and tensor products nested arbitrarily. Formally, Tree consists of
all states $|\psi_n\rangle$ such that $\mathrm{TS}(|\psi_n\rangle) \le p(n)$ for some polynomial $p$, where the tree size $\mathrm{TS}(|\psi_n\rangle)$ is
defined as follows. 

Definition 1 A quantum state tree over $\mathcal{H}_2^{\otimes n}$ is a rooted tree where each leaf vertex is labeled with
$\alpha|0\rangle + \beta|1\rangle$ for some $\alpha, \beta \in \mathbb{C}$, and each non-leaf vertex (called a gate) is labeled with either $+$ or
$\otimes$. Each vertex $v$ is also labeled with a set $S(v) \subseteq \{1, \ldots, n\}$, such that

(i) If $v$ is a leaf then $|S(v)| = 1$,

(ii) If $v$ is the root then $S(v) = \{1, \ldots, n\}$,

(iii) If $v$ is a $+$ gate and $w$ is a child of $v$, then $S(w) = S(v)$,

(iv) If $v$ is a $\otimes$ gate and $w_1, \ldots, w_k$ are the children of $v$, then $S(w_1), \ldots, S(w_k)$ are pairwise
disjoint and form a partition of $S(v)$.

Finally, if $v$ is a $+$ gate, then the outgoing edges of $v$ are labeled with complex numbers. For
each $v$, the subtree rooted at $v$ represents a quantum state of the qubits in $S(v)$ in the obvious way.
We require this state to be normalized for each $v$. 7

7 Requiring only the whole tree to represent a normalized state clearly yields no further generality. 
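To make Definition 1 concrete, here is a minimal Python sketch of a quantum state tree, together with functions that compute $S(v)$ while checking conditions (iii) and (iv), count leaves, and evaluate an amplitude. The tuple encoding and the names `qubits`, `size`, and `amplitude` are assumptions of this sketch, not notation from the paper, and the example tree is one possible tree for the state of Figure 2, not necessarily the one drawn there. The normalization requirement is not checked.

```python
from math import sqrt, isclose

# Illustrative encoding (an assumption of this sketch):
#   ("leaf", i, alpha, beta)       -> alpha|0> + beta|1> on qubit i
#   ("+", [(coeff, child), ...])   -> linear combination, weights on the edges
#   ("x", [child, ...])            -> tensor product

def qubits(t):
    """Return S(v) for the root of t, checking conditions (iii) and (iv)."""
    if t[0] == "leaf":
        return frozenset([t[1]])
    if t[0] == "+":
        sets = [qubits(child) for _, child in t[1]]
        assert all(s == sets[0] for s in sets), "children of + must share S(v)"
        return sets[0]
    sets = [qubits(child) for child in t[1]]
    union = frozenset().union(*sets)
    assert sum(len(s) for s in sets) == len(union), "children of x must be disjoint"
    return union

def size(t):
    """Tree size |T|: the number of leaf vertices."""
    if t[0] == "leaf":
        return 1
    children = [c for _, c in t[1]] if t[0] == "+" else t[1]
    return sum(size(c) for c in children)

def amplitude(t, x):
    """Amplitude <x|psi> of the state represented by t; x maps qubit -> bit."""
    if t[0] == "leaf":
        return t[3] if x[t[1]] else t[2]
    if t[0] == "+":
        return sum(coeff * amplitude(child, x) for coeff, child in t[1])
    result = 1
    for child in t[1]:
        result *= amplitude(child, x)
    return result

r = 1 / sqrt(2)
# (|00> + |01> + |10> - |11>)/2 written as a sum of two product states.
tree = ("+", [
    (r, ("x", [("leaf", 1, 1, 0), ("leaf", 2, r, r)])),    # |0>_1 (|0>_2 + |1>_2)/sqrt(2)
    (r, ("x", [("leaf", 1, 0, 1), ("leaf", 2, r, -r)])),   # |1>_1 (|0>_2 - |1>_2)/sqrt(2)
])
```

This tree has four leaves, so it witnesses a tree size of at most 4 for the two-qubit state; evaluating `amplitude` on all four basis strings recovers the amplitudes $1/2, 1/2, 1/2, -1/2$.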
Figure 3: Relations among quantum state classes.



We say a tree is orthogonal if it satisfies the further condition that if $v$ is a $+$ gate, then any
two children $w_1, w_2$ of $v$ represent $|\psi_1\rangle, |\psi_2\rangle$ with $\langle\psi_1|\psi_2\rangle = 0$. If the condition $\langle\psi_1|\psi_2\rangle = 0$ can
be replaced by the stronger condition that for all basis states $|x\rangle$, either $\langle\psi_1|x\rangle = 0$ or $\langle\psi_2|x\rangle = 0$,
then we say the tree is manifestly orthogonal. Manifest orthogonality is an extremely unphysical
definition; we introduce it only because it is interesting from a lower bounds perspective.
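The difference between the two conditions can be seen on single-qubit states. The Python sketch below tests orthogonality and manifest orthogonality of two amplitude vectors written in the fixed basis; the function names and the numerical tolerance are assumptions made for this example.

```python
from math import sqrt

def inner(u, v):
    """Inner product <u|v> of two amplitude lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def is_orthogonal(u, v, tol=1e-12):
    """True if <u|v> = 0 up to the tolerance."""
    return abs(inner(u, v)) < tol

def is_manifestly_orthogonal(u, v, tol=1e-12):
    """True if no basis state has nonzero amplitude in both u and v."""
    return all(abs(a) < tol or abs(b) < tol for a, b in zip(u, v))

r = 1 / sqrt(2)
plus, minus = [r, r], [r, -r]        # (|0> + |1>)/sqrt(2) and (|0> - |1>)/sqrt(2)
zero, one = [1.0, 0.0], [0.0, 1.0]   # |0> and |1>
```

Here `plus` and `minus` are orthogonal but not manifestly orthogonal, since both have nonzero amplitude on $|0\rangle$ and on $|1\rangle$, whereas `zero` and `one` satisfy both conditions. Manifest orthogonality therefore depends on the choice of basis, which is one way to see why the text calls it unphysical.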

For reasons of convenience, we define the size $|T|$ of a tree $T$ to be the number of leaf vertices.
Then given a state $|\psi\rangle \in \mathcal{H}_2^{\otimes n}$, the tree size $\mathrm{TS}(|\psi\rangle)$ is the minimum size of a tree that represents
$|\psi\rangle$. The orthogonal tree size OTS and manifestly orthogonal tree size MOTS are defined
similarly. Then OTree is the class of $|\psi_n\rangle$ such that $\mathrm{OTS}(|\psi_n\rangle) \le p(n)$ for some polynomial $p$, and
MOTree is the class such that $\mathrm{MOTS}(|\psi_n\rangle) \le p(n)$ for some $p$.

It is easy to see that

$$n \le \mathrm{TS}(|\psi\rangle) \le \mathrm{OTS}(|\psi\rangle) \le \mathrm{MOTS}(|\psi\rangle) \le n 2^n$$

for every $|\psi\rangle$, and that the set of $|\psi\rangle$ such that $\mathrm{TS}(|\psi\rangle) < 2^n$ has measure 0 in $\mathcal{H}_2^{\otimes n}$. Two other
important properties of TS and OTS are as follows:

Proposition 2

(i) TS and OTS are invariant under local^8 basis changes, up to a constant factor of 2.

(ii) If $|\varphi\rangle$ is obtained from $|\psi\rangle$ by applying a $k$-qubit unitary, then $\mathrm{TS}(|\varphi\rangle) \le k 4^k\, \mathrm{TS}(|\psi\rangle)$ and
$\mathrm{OTS}(|\varphi\rangle) \le k 4^k\, \mathrm{OTS}(|\psi\rangle)$.

Proof.

(i) Simply replace each occurrence of $|0\rangle$ in the original tree by a tree for $\alpha|0\rangle + \beta|1\rangle$, and each
occurrence of $|1\rangle$ by a tree for $\gamma|0\rangle + \delta|1\rangle$, as appropriate.

(ii) Suppose without loss of generality that the gate is applied to the first $k$ qubits. Let $T$ be a
tree representing $|\psi\rangle$, and let $T_y$ be the restriction of $T$ obtained by setting the first $k$ qubits to
$y \in \{0,1\}^k$. Clearly $|T_y| \le |T|$. Furthermore, we can express $|\varphi\rangle$ in the form $\sum_{y \in \{0,1\}^k} S_y T_y$,
where each $S_y$ represents a $k$-qubit state and hence is expressible by a tree of size $k 2^k$. ∎

8 Several people told us that a reasonable complexity measure must be invariant under all basis changes. Alas,
this would imply that all pure states have the same complexity!

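We can also define the $\varepsilon$-approximate tree size $\mathrm{TS}_\varepsilon(|\psi\rangle)$ to be the minimum size of a tree
representing a state $|\varphi\rangle$ such that $|\langle\psi|\varphi\rangle|^2 \ge 1 - \varepsilon$, and define $\mathrm{OTS}_\varepsilon(|\psi\rangle)$ and $\mathrm{MOTS}_\varepsilon(|\psi\rangle)$ similarly.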

Definition 3 An arithmetic formula (over the ring $\mathbb{C}$ and $n$ variables) is a rooted binary tree where
each leaf vertex is labeled with either a complex number or a variable in $\{x_1, \ldots, x_n\}$, and each
non-leaf vertex is labeled with either $+$ or $\times$. Such a tree represents a polynomial $p(x_1, \ldots, x_n)$ in the
obvious way. We call a polynomial multilinear if no variable appears raised to a higher power than
1, and an arithmetic formula multilinear if the polynomials computed by each of its subtrees are
multilinear.

The size $|\Phi|$ of a multilinear formula $\Phi$ is the number of leaf vertices. Given a multilinear
polynomial $p$, the multilinear formula size $\mathrm{MFS}(p)$ is the minimum size of a multilinear formula
that represents $p$. Then given a function $f : \{0,1\}^n \to \mathbb{C}$, we define

$$\mathrm{MFS}(f) = \min_{p \,:\, p(x) = f(x)\ \forall x \in \{0,1\}^n} \mathrm{MFS}(p).$$

(Actually $p$ turns out to be unique [SHI].) We can also define the $\varepsilon$-approximate multilinear formula
size of $f$,

$$\mathrm{MFS}_\varepsilon(f) = \min_{p \,:\, \|p - f\|_2^2 \le \varepsilon} \mathrm{MFS}(p)$$

where $\|p - f\|_2^2 = \sum_{x \in \{0,1\}^n} |p(x) - f(x)|^2$. (This metric is closely related to the inner product
$\sum_x p(x)^* f(x)$, but is often more convenient to work with.) Now given a state $|\psi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle$
in $\mathcal{H}_2^{\otimes n}$, let $f_\psi$ be the function from $\{0,1\}^n$ to $\mathbb{C}$ defined by $f_\psi(x) = \alpha_x$.
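As an illustration of the uniqueness remark (not part of the original argument), the following Python sketch computes the multilinear polynomial that agrees with a given $f$ on $\{0,1\}^n$ by Möbius inversion over subsets. The function names `multilinear_coefficients` and `evaluate` are hypothetical, and the brute-force loop is meant only for very small $n$.

```python
import math
from itertools import product

def multilinear_coefficients(f, n):
    """Return {T: c_T} such that p(x) = sum_T c_T * prod_{i in T} x_i agrees with f on {0,1}^n.

    Moebius inversion: c_T = sum over S subset of T of (-1)^(|T|-|S|) * f(indicator of S).
    """
    coeffs = {}
    for bits in product((0, 1), repeat=n):
        T = frozenset(i for i, b in enumerate(bits) if b)
        total = 0
        for sub in product((0, 1), repeat=n):
            if all(s <= b for s, b in zip(sub, bits)):  # sub encodes a subset of T
                total += (-1) ** (sum(bits) - sum(sub)) * f(sub)
        if total != 0:
            coeffs[T] = total
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the multilinear polynomial given by coeffs at the point x."""
    return sum(c * math.prod(x[i] for i in T) for T, c in coeffs.items())

# XOR on two bits has the multilinear extension x0 + x1 - 2*x0*x1.
xor_poly = multilinear_coefficients(lambda x: x[0] ^ x[1], 2)
```

Because the coefficients are determined one subset at a time, there is no freedom in choosing them, which is the sense in which $p$ is unique.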

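Theorem 4 For all $|\psi\rangle$,

(i) $\mathrm{MFS}(f_\psi) = O(\mathrm{TS}(|\psi\rangle))$.

(ii) $\mathrm{TS}(|\psi\rangle) = O(\mathrm{MFS}(f_\psi) + n)$.

(iii) $\mathrm{MFS}_\delta(f_\psi) = O(\mathrm{TS}_\varepsilon(|\psi\rangle))$ where $\delta = 2 - 2\sqrt{1-\varepsilon}$.

(iv) $\mathrm{TS}_{2\varepsilon}(|\psi\rangle) = O(\mathrm{MFS}_\varepsilon(f_\psi) + n)$.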

Proof. 

(i) Given a tree representing $|\psi\rangle$, replace every unbounded fan-in gate by a collection of binary
gates, every $\otimes$ by $\times$, every $|1\rangle_i$ vertex by $x_i$, and every $|0\rangle_i$ vertex by a formula for $1 - x_i$.
Push all multiplications by constants at the edges down to $\times$ gates at the leaves.

(ii) Given a multilinear formula $\Phi$ for $f_\psi$, let $p(v)$ be the polynomial computed at vertex $v$ of $\Phi$,
and let $S(v)$ be the set of variables that appears in $p(v)$. First, call $\Phi$ syntactic if at every
$\times$ gate with children $v$ and $w$, $S(v) \cap S(w) = \emptyset$. A lemma of Raz [41] states that we can
always make $\Phi$ syntactic without increasing its size.

Second, at every $+$ gate $u$ with children $v$ and $w$, enlarge both $S(v)$ and $S(w)$ to $S(v) \cup S(w)$,
by multiplying $p(v)$ by $x_i + (1 - x_i)$ for every $x_i \in S(w) \setminus S(v)$, and multiplying $p(w)$ by
$x_i + (1 - x_i)$ for every $x_i \in S(v) \setminus S(w)$. Doing this does not invalidate any $\times$ gate that
is an ancestor of $u$, since by the assumption that $\Phi$ is syntactic, $p(u)$ is never multiplied by
any polynomial containing variables in $S(v) \cup S(w)$. Similarly, enlarge $S(r)$ to $\{x_1, \ldots, x_n\}$
where $r$ is the root of $\Phi$.

Third, call $v$ max-linear if $|S(v)| = 1$ but $|S(w)| > 1$ where $w$ is the parent of $v$. If $v$ is
max-linear and $p(v) = a + b x_i$, then replace the tree rooted at $v$ by a tree computing $a|0\rangle_i +
(a + b)|1\rangle_i$. Also, replace all multiplications by constants higher in $\Phi$ by multiplications at
the edges. (Because of the second step, there are no additions by constants higher in $\Phi$.)
Replacing every $\times$ by $\otimes$ then gives a tree representing $|\psi\rangle$, whose size is easily seen to be
$O(|\Phi| + n)$.

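(iii) Apply the reduction from part (i). Let the resulting multilinear formula compute polynomial
$p$; then

$$\sum_{x \in \{0,1\}^n} |p(x) - f_\psi(x)|^2 = 2 - 2\sum_{x \in \{0,1\}^n} p(x)\,\overline{f_\psi(x)} \le 2 - 2\sqrt{1-\varepsilon} = \delta.$$

(iv) Apply the reduction from part (ii). Let $(\beta_x)_{x \in \{0,1\}^n}$ be the resulting amplitude vector; since
this vector might not be normalized, divide each $\beta_x$ by $\sqrt{\sum_x |\beta_x|^2}$ to produce $\beta'_x$. Then

$$\left|\sum_{x \in \{0,1\}^n} \beta'_x \overline{\alpha_x}\right|^2 = 1 - \frac{1}{2}\sum_{x \in \{0,1\}^n} |\beta'_x - \alpha_x|^2$$

$$\ge 1 - \frac{1}{2}\left(\sqrt{\sum_{x \in \{0,1\}^n} |\beta'_x - \beta_x|^2} + \sqrt{\sum_{x \in \{0,1\}^n} |\beta_x - \alpha_x|^2}\right)^2$$

$$\ge 1 - \frac{1}{2}\left(2\sqrt{\varepsilon}\right)^2 = 1 - 2\varepsilon.$$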
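To make the reduction in part (i) concrete, here is a minimal Python sketch under an assumed, hypothetical tree encoding (nested tuples; the encoding is ours, not the paper's). It replaces each $|1\rangle_i$ leaf by $x_i$, each $|0\rangle_i$ leaf by $1 - x_i$, and each tensor product by an ordinary product, then checks that the resulting formula reproduces the amplitudes of the state $(|00\rangle + |11\rangle)/\sqrt{2}$.

```python
import math

# Hypothetical encoding of a state tree (an illustration only):
#   ("ket", i, b)                 leaf |b> on qubit i
#   ("tensor", [child, ...])      tensor-product gate
#   ("sum", [(coef, child), ...]) addition gate with constants on the edges
def formula_value(tree, x):
    """Evaluate the multilinear formula obtained from the tree at the bit string x."""
    kind = tree[0]
    if kind == "ket":
        _, i, b = tree
        return x[i] if b == 1 else 1 - x[i]      # |1>_i -> x_i,  |0>_i -> 1 - x_i
    if kind == "tensor":
        return math.prod(formula_value(c, x) for c in tree[1])   # tensor -> product
    if kind == "sum":
        return sum(coef * formula_value(c, x) for coef, c in tree[1])
    raise ValueError(kind)

s = 1 / math.sqrt(2)
bell = ("sum", [
    (s, ("tensor", [("ket", 0, 0), ("ket", 1, 0)])),
    (s, ("tensor", [("ket", 0, 1), ("ket", 1, 1)])),
])
amplitudes = {x: formula_value(bell, x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

The formula has one leaf per leaf of the tree, which is the content of $\mathrm{MFS}(f_\psi) = O(\mathrm{TS}(|\psi\rangle))$ in this toy case.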



Besides Tree, OTree, and MOTree, four other classes of quantum states deserve mention: 

Circuit, a circuit analog of Tree, contains the states $|\psi_n\rangle = \sum_x \alpha_x |x\rangle$ such that for all $n$, there
exists a multilinear arithmetic circuit of size $p(n)$ over the complex numbers that outputs $\alpha_x$ given
$x$ as input, for some polynomial $p$. (Multilinear circuits are the same as multilinear trees, except
that they allow unbounded fanout; that is, polynomials computed at intermediate points can be
reused arbitrarily many times.)

AmpP contains the states $|\psi_n\rangle = \sum_x \alpha_x |x\rangle$ such that for all $n, b$, there exists a classical circuit
of size $p(n + b)$ that outputs $\alpha_x$ to $b$ bits of precision given $x$ as input, for some polynomial $p$.

Vidal contains the states that are 'polynomially entangled' in the sense of Vidal [46]. Given a
partition of $\{1, \ldots, n\}$ into $A$ and $B$, let $\chi_A(|\psi_n\rangle)$ be the minimum $k$ for which $|\psi_n\rangle$ can be written as
$\sum_{i=1}^{k} \alpha_i |\varphi_i^A\rangle \otimes |\varphi_i^B\rangle$, where $|\varphi_i^A\rangle$ and $|\varphi_i^B\rangle$ are states of qubits in $A$ and $B$ respectively. ($\chi_A(|\psi_n\rangle)$
is known as the Schmidt rank; see [37] for more information.) Let $\chi(|\psi_n\rangle) = \max_A \chi_A(|\psi_n\rangle)$. Then
$|\psi_n\rangle \in \mathsf{Vidal}$ if and only if $\chi(|\psi_n\rangle) \le p(n)$ for some polynomial $p$.

$\Psi\mathsf{P}$ contains the states $|\psi_n\rangle$ such that for all $n$ and $\varepsilon > 0$, there exists a quantum circuit of size
$p(n + \log(1/\varepsilon))$ that maps the all-0 state to a state some part of which has trace distance at most
$1 - \varepsilon$ from $|\psi_n\rangle$, for some polynomial $p$. Because of the Solovay-Kitaev Theorem [32, 37], $\Psi\mathsf{P}$ is
invariant under the choice of universal gate set.
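The quantity $\chi$ in the definition of Vidal can be computed by brute force for tiny examples. The sketch below is an illustration under stated assumptions: amplitudes are rational (so that exact `Fraction` arithmetic applies), states may be left unnormalized because scaling does not change rank, and the helper names `rank`, `schmidt_rank`, and `chi` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def rank(rows):
    """Exact rank of a rational matrix by Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def schmidt_rank(amp, n, A):
    """Schmidt rank across the cut (A, complement of A); amp maps bit tuples to amplitudes."""
    B = [i for i in range(n) if i not in A]
    def assemble(ya, zb):
        x = [0] * n
        for i, v in zip(A, ya):
            x[i] = v
        for i, v in zip(B, zb):
            x[i] = v
        return tuple(x)
    matrix = [[amp.get(assemble(ya, zb), 0) for zb in product((0, 1), repeat=len(B))]
              for ya in product((0, 1), repeat=len(A))]
    return rank(matrix)

def chi(amp, n):
    """Maximum Schmidt rank over all nontrivial bipartitions."""
    return max(schmidt_rank(amp, n, A)
               for size in range(1, n) for A in combinations(range(n), size))

ghz = {(0, 0, 0): 1, (1, 1, 1): 1}                    # |000> + |111>, unnormalized
uniform = {x: 1 for x in product((0, 1), repeat=3)}   # product state |+>|+>|+>, unnormalized
```

For the three-qubit cat state every cut has Schmidt rank 2, while the uniform superposition is a product state with $\chi = 1$.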
4 Basic Results



Before studying the tree size of specific quantum states, we would like to know in general how tree 
size behaves as a complexity measure. In this section we prove three rather nice properties of tree 
size. 

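Theorem 5 For all $\varepsilon > 0$, there exists a tree representing $|\psi\rangle$ of size $O(\mathrm{TS}(|\psi\rangle)^{1+\varepsilon})$ and depth
$O(\log \mathrm{TS}(|\psi\rangle))$, as well as a manifestly orthogonal tree of size $O(\mathrm{MOTS}(|\psi\rangle)^{1+\varepsilon})$ and depth
$O(\log \mathrm{MOTS}(|\psi\rangle))$.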

Proof. A classical theorem of Brent [12] says that given an arithmetic formula $\Phi$, there exists
an equivalent formula of depth $O(\log |\Phi|)$ and size $O(|\Phi|^c)$, where $c$ is a constant. Bshouty, Cleve,
and Eberly [13] (see also Bonet and Buss [10]) improved Brent's theorem to show that $c$ can be
taken to be $1 + \varepsilon$ for any $\varepsilon > 0$. So it suffices to show that, for 'division-free' formulas, these
theorems preserve multilinearity (and in the MOTS case, preserve manifest orthogonality).

Brent's theorem is proven by induction on $|\Phi|$. Here is a sketch: choose a subformula $I$ of $\Phi$ of
size between $|\Phi|/3$ and $2|\Phi|/3$ (which one can show always exists). Then identifying a subformula
with the polynomial computed at its root, $\Phi(x)$ can be written as $G(x) + H(x) I(x)$ for some
formulas $G$ and $H$. Furthermore, $G$ and $H$ are both obtainable from $\Phi$ by removing $I$ and then
applying further restrictions. So $|G|$ and $|H|$ are both at most $|\Phi| - |I| + O(1)$. Let $\hat\Phi$ be a formula
equivalent to $\Phi$ that evaluates $G$, $H$, and $I$ separately, and then returns $G(x) + H(x) I(x)$. Then
$|\hat\Phi|$ is larger than $|\Phi|$ by at most a constant factor, while by the induction hypothesis, we can
assume the formulas for $G$, $H$, and $I$ have logarithmic depth. Since the number of induction steps
is $O(\log |\Phi|)$, the total depth is logarithmic and the total blowup in formula size is polynomial in
$|\Phi|$. Bshouty, Cleve, and Eberly's improvement uses a more careful decomposition of $\Phi$, but the
basic idea is the same.

Now, if $\Phi$ is syntactic multilinear, then clearly $G$, $H$, and $I$ are also syntactic multilinear.
Furthermore, $H$ cannot share variables with $I$, since otherwise a subformula of $\Phi$ containing $I$
would have been multiplied by a subformula containing variables from $I$. Thus multilinearity is
preserved. To see that manifest orthogonality is preserved, suppose we are evaluating $G$ and $H$
'bottom up,' and let $G_v$ and $H_v$ be the polynomials computed at vertex $v$ of $\Phi$. Let $v_0 = \mathrm{root}(I)$,
let $v_1$ be the parent of $v_0$, let $v_2$ be the parent of $v_1$, and so on until $v_k = \mathrm{root}(\Phi)$. It is clear
that, for every $x$, either $G_{v_0}(x) = 0$ or $H_{v_0}(x) = 0$. Furthermore, suppose that property holds
for $G_{v_{i-1}}, H_{v_{i-1}}$; then by induction it holds for $G_{v_i}, H_{v_i}$. If $v_i$ is a $\times$ gate, then this follows
from multilinearity (if $|\psi\rangle$ and $|\varphi\rangle$ are manifestly orthogonal, then $|0\rangle \otimes |\psi\rangle$ and $|0\rangle \otimes |\varphi\rangle$ are also
manifestly orthogonal). If $v_i$ is a $+$ gate, then letting $\mathrm{supp}(p)$ be the set of $x$ such that $p(x) \ne 0$,
any polynomial $p$ added to $G_{v_{i-1}}$ or $H_{v_{i-1}}$ must have

$$\mathrm{supp}(p) \cap \left(\mathrm{supp}(G_{v_{i-1}}) \cup \mathrm{supp}(H_{v_{i-1}})\right) = \emptyset,$$

and manifest orthogonality follows. ■

Theorem 6 Any $|\psi\rangle$ can be prepared by a quantum circuit of size polynomial in $\mathrm{OTS}(|\psi\rangle)$. Thus
$\mathsf{OTree} \subseteq \Psi\mathsf{P}$.

Proof. Let $\Gamma(|\psi\rangle)$ be the minimum size of a circuit needed to prepare $|\psi\rangle \in \mathcal{H}_2^{\otimes n}$ starting from
$|0\rangle^{\otimes n}$. We prove by induction on $\Gamma$ that $\Gamma(|\psi\rangle) \le q(\mathrm{OTS}(|\psi\rangle))$ for some polynomial $q$. The
base case $\mathrm{OTS}(|\psi\rangle) = 1$ is clear. Let $T$ be an orthogonal state tree for $|\psi\rangle$, and assume without
loss of generality that every gate has fan-in 2 (this increases $|T|$ by at most a constant factor).
Let $T_1$ and $T_2$ be the subtrees of $\mathrm{root}(T)$, representing states $|\psi_1\rangle$ and $|\psi_2\rangle$ respectively; note that
$|T| = |T_1| + |T_2|$. First suppose $\mathrm{root}(T)$ is a $\otimes$ gate; then clearly $\Gamma(|\psi\rangle) \le \Gamma(|\psi_1\rangle) + \Gamma(|\psi_2\rangle)$.

Second, suppose $\mathrm{root}(T)$ is a $+$ gate, with $|\psi\rangle = \alpha|\psi_1\rangle + \beta|\psi_2\rangle$ and $\langle\psi_1|\psi_2\rangle = 0$. Let $U$ be a
quantum circuit that prepares $|\psi_1\rangle$, and $V$ be a circuit that prepares $|\psi_2\rangle$. Then we can prepare
$\alpha|0\rangle|0\rangle^{\otimes n} + \beta|1\rangle U^{-1}V|0\rangle^{\otimes n}$. Observe that $U^{-1}V|0\rangle^{\otimes n}$ is orthogonal to $|0\rangle^{\otimes n}$, since $|\psi_1\rangle = U|0\rangle^{\otimes n}$
is orthogonal to $|\psi_2\rangle = V|0\rangle^{\otimes n}$. So applying a NOT to the first register, conditioned on the
OR of the bits in the second register, yields $|0\rangle \otimes (\alpha|0\rangle^{\otimes n} + \beta U^{-1}V|0\rangle^{\otimes n})$, from which we obtain
$\alpha|\psi_1\rangle + \beta|\psi_2\rangle$ by applying $U$ to the second register. The size of the circuit used is $O(|U| + |V| + n)$,
with a possible constant-factor blowup arising from the need to condition on the first register. If
we are more careful, however, we can combine the 'conditioning' steps across multiple levels of the
recursion, producing a circuit of size $|V| + O(|U| + n)$. By symmetry, we can also reverse the roles
of $U$ and $V$ to obtain a circuit of size $|U| + O(|V| + n)$. Therefore

$$\Gamma(|\psi\rangle) \le \min\{\Gamma(|\psi_1\rangle) + c\,\Gamma(|\psi_2\rangle) + cn,\ c\,\Gamma(|\psi_1\rangle) + \Gamma(|\psi_2\rangle) + cn\}$$

for some constant $c \ge 2$. Solving this recurrence we find that $\Gamma(|\psi\rangle)$ is polynomial in $\mathrm{OTS}(|\psi\rangle)$.
■
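The final step, "solving this recurrence," can be explored numerically. The sketch below is a toy model, not the paper's proof: it assumes a unit cost for a single leaf, tracks only tree size $s$, and takes the worst case over all ways to split $s = s_1 + s_2$ at a $+$ gate. The function name `worst_case_bound` and the parameter values are hypothetical. In this model the values stay below $(1+n)\,s^{\log_2(1+c)}$, which is polynomial in $s$ for constant $c$.

```python
import math

def worst_case_bound(max_size, n, c=2):
    """B[s] = worst case, over splits s = s1 + s2, of the recurrence
    min{B[s1] + c*B[s2] + c*n, c*B[s1] + B[s2] + c*n}, with B[1] = 1 (assumed)."""
    B = [0, 1]
    for s in range(2, max_size + 1):
        worst = 0
        for s1 in range(1, s):
            s2 = s - s1
            cost = min(B[s1] + c * B[s2] + c * n,
                       c * B[s1] + B[s2] + c * n)
            worst = max(worst, cost)
        B.append(worst)
    return B

B = worst_case_bound(64, n=10, c=2)
```

The key point is that the $\min$ always lets the factor $c$ fall on the smaller subtree, so an unbalanced tree cannot compound the factor $c$ along a long path.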

Theorem 7 If $|\psi\rangle \in \mathcal{H}_2^{\otimes n}$ is chosen uniformly at random under the Haar measure, then $\mathrm{TS}_{1/16}(|\psi\rangle) =
2^{\Omega(n)}$ with probability $1 - o(1)$.

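Proof. To generate a uniform random state $|\psi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle$, we can choose $\hat\alpha_x, \hat\beta_x \in \mathbb{R}$
for each $x$ independently from a Gaussian distribution with mean 0 and variance 1, then let $\alpha_x =
(\hat\alpha_x + i\hat\beta_x)/\sqrt{R}$ where $R = \sum_{x \in \{0,1\}^n} (\hat\alpha_x^2 + \hat\beta_x^2)$. Let

$$\Lambda_\psi = \left\{x : (\mathrm{Re}\,\alpha_x)^2 < \frac{1}{4 \cdot 2^n}\right\},$$

and let $\mathcal{G}$ be the set of $|\psi\rangle$ for which $|\Lambda_\psi| < 2^n/5$. We claim that $\Pr_\psi[|\psi\rangle \in \mathcal{G}] = 1 - o(1)$. First,
$\mathrm{EX}[R] = 2^{n+1}$, so by a standard Hoeffding-type bound, $\Pr[R < 2^n]$ is doubly-exponentially small
in $n$. Second, assuming $R \ge 2^n$, for each $x$

$$\Pr[x \in \Lambda_\psi] \le \Pr\left[\hat\alpha_x^2 < \frac{1}{4}\right] = \mathrm{erf}\left(\frac{1}{4\sqrt{2}}\right) < 0.198,$$

and the claim follows by a Chernoff bound.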

For $g : \{0,1\}^n \to \mathbb{R}$, let $\Lambda_g = \{x : \mathrm{sgn}(g(x)) \ne \mathrm{sgn}(\mathrm{Re}\,\alpha_x)\}$, where $\mathrm{sgn}(y)$ is 1 if $y \ge 0$ and $-1$
otherwise. Then if $|\psi\rangle \in \mathcal{G}$, clearly

$$\sum_{x \in \{0,1\}^n} |g(x) - f_\psi(x)|^2 \ge \frac{|\Lambda_g| - |\Lambda_\psi|}{4 \cdot 2^n}$$

where $f_\psi(x) = \mathrm{Re}\,\alpha_x$, and thus

$$|\Lambda_g| < \left(4\|g - f_\psi\|_2^2 + \frac{1}{5}\right) 2^n.$$

Therefore to show that $\mathrm{MFS}_{1/15}(f_\psi) = 2^{\Omega(n)}$ with probability $1 - o(1)$, we need only show that for
almost all Boolean functions $f : \{0,1\}^n \to \{-1, 1\}$, there is no arithmetic formula $\Phi$ of size $2^{o(n)}$
such that

$$|\{x : \mathrm{sgn}(\Phi(x)) \ne f(x)\}| \le 0.49 \cdot 2^n.$$

Here an arithmetic formula is real-valued, and can include addition, subtraction, and multiplication
gates of fan-in 2 as well as constants. We do not need to assume multilinearity, and it is easy
to see that the assumption of bounded fan-in is without loss of generality. Let $W$ be the set
of Boolean functions sign-represented by an arithmetic formula $\Phi$ of size $2^{o(n)}$, in the sense that
$\mathrm{sgn}(\Phi(x)) = f(x)$ for all $x$. Then it suffices to show that $|W| = 2^{2^{o(n)}}$, since the number of
functions sign-represented on an 0.51 fraction of inputs is at most $|W| \cdot 2^{2^n H(0.51)}$. (Here $H$ denotes
the binary entropy function.)

Let $\Phi$ be an arithmetic formula that takes as input the binary string $x = (x_1, \ldots, x_n)$ as well as
constants $c_1, c_2, \ldots$. Let $\Phi_c$ denote $\Phi$ under a particular assignment $c$ to $c_1, c_2, \ldots$. Then a result
of Gashkov [22] (see also Turán and Vatan [44]), which follows from Warren's Theorem [47] in
real algebraic geometry, shows that as we range over all $c$, $\Phi_c$ sign-represents at most $(2^{n+4}|\Phi|)^{|\Phi|}$
distinct Boolean functions, where $|\Phi|$ is the size of $\Phi$. Furthermore, excluding constants, the
number of distinct arithmetic formulas of size $|\Phi|$ is at most $(3|\Phi|^2)^{|\Phi|}$. When $|\Phi| = 2^{o(n)}$, this
gives $(3|\Phi|^2)^{|\Phi|} \cdot (2^{n+4}|\Phi|)^{|\Phi|} = 2^{2^{o(n)}}$. We have shown that $\mathrm{MFS}_{1/15}(f_\psi) = 2^{\Omega(n)}$; by Theorem
4, part (iii), this implies that $\mathrm{TS}_{1/16}(|\psi\rangle) = 2^{\Omega(n)}$. ■
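Two numerical constants carry this proof, and both are easy to check. The short Python sketch below (an illustration; the names `delta`, `d`, and `threshold` are ours) verifies that $\delta = 2 - 2\sqrt{1-\varepsilon}$ evaluated at $\varepsilon = 1/16$ is below $1/15$, so a lower bound on $\mathrm{MFS}_{1/15}$ transfers to $\mathrm{TS}_{1/16}$ via Theorem 4, part (iii), and that $\|g - f_\psi\|_2^2 \le 1/15$ forces $|\Lambda_g|$ below $0.49 \cdot 2^n$.

```python
import math

def delta(eps):
    """The quantity delta = 2 - 2*sqrt(1 - eps) from Theorem 4, part (iii)."""
    return 2 - 2 * math.sqrt(1 - eps)

d = delta(1 / 16)                  # roughly 0.0635, which is below 1/15
threshold = 4 * (1 / 15) + 1 / 5   # roughly 0.4667, which is below 0.49
```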

A corollary of Theorem 7 is the following 'nonamplification' property: there exist states that
can be approximated to within, say, 1% by trees of polynomial size, but that require exponentially 
large trees to approximate to within a smaller margin (say 0.01%). 

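Corollary 8 For all $\delta \in (0, 1]$, there exists a state $|\psi\rangle$ such that $\mathrm{TS}_\delta(|\psi\rangle) = n$ but $\mathrm{TS}_\varepsilon(|\psi\rangle) = 2^{\Omega(n)}$
where $\varepsilon = \delta/32 - \delta^2/4096$.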

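Proof. It is clear from Theorem 7 that there exists a state $|\varphi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle$ such that
$\mathrm{TS}_{1/16}(|\varphi\rangle) = 2^{\Omega(n)}$ and $\alpha_{0^n} = 0$. Take $|\psi\rangle = \sqrt{1-\delta}\,|0\rangle^{\otimes n} + \sqrt{\delta}\,|\varphi\rangle$. Since $|\langle\psi|0\rangle^{\otimes n}|^2 = 1 - \delta$, we
have $\mathrm{MOTS}_\delta(|\psi\rangle) = n$. On the other hand, suppose some $|\phi\rangle = \sum_{x \in \{0,1\}^n} \beta_x |x\rangle$ with $\mathrm{TS}(|\phi\rangle) =
2^{o(n)}$ satisfies $|\langle\phi|\psi\rangle|^2 \ge 1 - \varepsilon$. Then

$$\sum_{x \ne 0^n} \left(\sqrt{\delta}\,\alpha_x - \beta_x\right)^2 \le 2 - 2\sqrt{1-\varepsilon}.$$

Thus, letting $f_\varphi(x) = \alpha_x$, we have $\mathrm{MFS}_c(f_\varphi) = O(\mathrm{TS}(|\phi\rangle))$ where $c = (2 - 2\sqrt{1-\varepsilon})/\delta$. By
Theorem 4 part (iv), this implies that $\mathrm{TS}_{2c}(|\varphi\rangle) = O(\mathrm{TS}(|\phi\rangle))$. But $2c = 1/16$ when $\varepsilon =
\delta/32 - \delta^2/4096$, contradiction. ■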
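The choice $\varepsilon = \delta/32 - \delta^2/4096$ is exactly what makes $2c = 1/16$: setting $2 - 2\sqrt{1-\varepsilon} = \delta/32$ gives $\sqrt{1-\varepsilon} = 1 - \delta/64$, and squaring yields the stated $\varepsilon$. The following Python sketch (function names are ours) checks this numerically for a few values of $\delta$.

```python
import math

def eps_for(delta):
    """The epsilon of Corollary 8 as a function of delta."""
    return delta / 32 - delta ** 2 / 4096

def two_c(delta):
    """2c, where c = (2 - 2*sqrt(1 - eps)) / delta as in the proof."""
    eps = eps_for(delta)
    return 2 * (2 - 2 * math.sqrt(1 - eps)) / delta

values = [two_c(d) for d in (0.01, 0.5, 1.0)]
```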



5 Lower Bounds 

We want to show that certain quantum states of interest to us are not represented by trees of 
polynomial size. At first this seems like a hopeless task. Proving superpolynomial formula-size
lower bounds for 'explicit' functions is a notoriously hard open problem, as it would imply
complexity class separations such as $\mathsf{NC}^1 \ne \mathsf{P}$.

Here, though, we are only concerned with multilinear formulas. Could this make it easier 
to prove a lower bound? The answer is not obvious, but very recently, for reasons unrelated to
quantum computing, Raz [41, 42] showed the first superpolynomial lower bounds on multilinear
formula size. In particular, he showed that multilinear formulas computing the permanent or
determinant of an $n \times n$ matrix over any field have size $n^{\Omega(\log n)}$.

Raz's technique is a beautiful combination of the Furst-Saxe-Sipser method of random restrictions
[22, with matrix rank arguments as used in communication complexity. We now outline
the method. Given a function $f : \{0,1\}^n \to \mathbb{C}$, let $P$ be a partition of the input variables
$x_1, \ldots, x_n$ into two collections $y = (y_1, \ldots, y_{n/2})$ and $z = (z_1, \ldots, z_{n/2})$. This yields a function
$f_P(y, z) : \{0,1\}^{n/2} \times \{0,1\}^{n/2} \to \mathbb{C}$. Then let $M_{f|P}$ be a $2^{n/2} \times 2^{n/2}$ matrix whose rows are labeled
by assignments $y \in \{0,1\}^{n/2}$, and whose columns are labeled by assignments $z \in \{0,1\}^{n/2}$. The
$(y, z)$ entry of $M_{f|P}$ is $f_P(y, z)$. Let $\mathrm{rank}(M_{f|P})$ be the rank of $M_{f|P}$ over the complex numbers.
Finally, let $\mathcal{P}$ be the uniform distribution over all partitions $P$.
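The matrix $M_{f|P}$ is easy to build explicitly for small $n$. The Python sketch below is an illustration with a toy function of our own choosing ($f(x) = 1$ exactly when $x_0 = x_2$ and $x_1 = x_3$); the helper names `rank` and `partition_matrix` are hypothetical. It shows how strongly the rank can depend on the partition: one partition gives the full rank 4, another gives rank 1.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Exact rank of a rational matrix by Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def partition_matrix(f, n, y_vars):
    """Rows are assignments to the variables in y_vars, columns to the remaining variables."""
    z_vars = [i for i in range(n) if i not in y_vars]
    rows = []
    for y in product((0, 1), repeat=len(y_vars)):
        row = []
        for z in product((0, 1), repeat=len(z_vars)):
            x = [0] * n
            for i, v in zip(y_vars, y):
                x[i] = v
            for i, v in zip(z_vars, z):
                x[i] = v
            row.append(f(tuple(x)))
        rows.append(row)
    return rows

def f(x):
    """Toy function: 1 iff x0 == x2 and x1 == x3."""
    return int(x[0] == x[2] and x[1] == x[3])

rank_split = rank(partition_matrix(f, 4, [0, 1]))   # y = (x0, x1): identity matrix
rank_paired = rank(partition_matrix(f, 4, [0, 2]))  # y = (x0, x2): f factors as g(y) * h(z)
```

This is why the theorems below quantify over a random partition rather than a fixed one.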

The following, Corollary 3.6 in [42], is one statement of Raz's main theorem; recall that $\mathrm{MFS}(f)$
is the minimum size of a multilinear formula for $f$.



Theorem 9 ([42]) Suppose that

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}(M_{f|P}) \ge 2^{n/2 - (n/2)^{1/8}/2}\right] = n^{-o(\log n)}.$$

Then $\mathrm{MFS}(f) = n^{\Omega(\log n)}$.



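An immediate corollary yields lower bounds on approximate multilinear formula size. Given
an $N \times N$ matrix $M = (m_{ij})$, let

$$\mathrm{rank}_\varepsilon(M) = \min_{L \,:\, \|L - M\|_2^2 \le \varepsilon} \mathrm{rank}(L) \quad \text{where} \quad \|L - M\|_2^2 = \sum_{i,j=1}^{N} |\ell_{ij} - m_{ij}|^2.$$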



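Corollary 10 Suppose that

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}_\varepsilon(M_{f|P}) \ge 2^{n/2 - (n/2)^{1/8}/2}\right] = n^{-o(\log n)}.$$

Then $\mathrm{MFS}_\varepsilon(f) = n^{\Omega(\log n)}$.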



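Proof. Suppose $\mathrm{MFS}_\varepsilon(f) = n^{o(\log n)}$. Then for some $g$ such that $\|f - g\|_2^2 \le \varepsilon$, we would have
$\mathrm{MFS}(g) = n^{o(\log n)}$, and therefore

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}(M_{g|P}) \ge 2^{n/2 - (n/2)^{1/8}/2}\right] = n^{-\Omega(\log n)}$$

by Theorem 9. But $\mathrm{rank}_\varepsilon(M_{f|P}) \le \mathrm{rank}(M_{g|P})$, and hence

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}_\varepsilon(M_{f|P}) \ge 2^{n/2 - (n/2)^{1/8}/2}\right] = n^{-\Omega(\log n)},$$

contradiction. ■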

Another simple corollary gives lower bounds in terms of restrictions of $f$. Let $\mathcal{R}_\ell$ be the following
distribution over restrictions $R$: choose $2\ell$ variables of $f$ uniformly at random, and rename them
$y = (y_1, \ldots, y_\ell)$ and $z = (z_1, \ldots, z_\ell)$. Set each of the remaining $n - 2\ell$ variables to 0 or 1 uniformly
and independently at random. This yields a restricted function $f_R(y, z)$. Let $M_{f|R}$ be a $2^\ell \times 2^\ell$
matrix whose $(y, z)$ entry is $f_R(y, z)$.



Corollary 11 Suppose that

$$\Pr_{R \in \mathcal{R}_\ell}\left[\mathrm{rank}(M_{f|R}) \ge 2^{\ell - \ell^{1/8}/2}\right] = n^{-o(\log n)}$$

where $\ell = n^\delta$ for some constant $\delta \in (0, 1]$. Then $\mathrm{MFS}(f) = n^{\Omega(\log n)}$.

Proof. Under the hypothesis, clearly there exists a fixed restriction $g : \{0,1\}^{2\ell} \to \mathbb{C}$ of $f$, which
leaves $2\ell$ variables unrestricted, such that

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}(M_{g|P}) \ge 2^{\ell - \ell^{1/8}/2}\right] = n^{-o(\log n)} = \ell^{-o(\log \ell)}.$$

Then by Theorem 9,

$$\mathrm{MFS}(f) \ge \mathrm{MFS}(g) = \ell^{\Omega(\log \ell)} = n^{\Omega(\log n)}.$$

■

We will apply Raz's theorem to obtain $n^{\Omega(\log n)}$ tree size lower bounds for two classes of quantum
states: states arising in quantum error-correction in Section 5.1, and (assuming a number-theoretic
conjecture) states arising in Shor's factoring algorithm in Section 5.2.

5.1 Subgroup States 

Let the elements of $\mathbb{Z}_2^n$ be labeled by $n$-bit strings. Given a subgroup $S \le \mathbb{Z}_2^n$, we define the
subgroup state $|S\rangle$ as follows:

$$|S\rangle = \frac{1}{\sqrt{|S|}} \sum_{x \in S} |x\rangle.$$

Coset states arise as codewords in the class of quantum error-correcting codes known as stabilizer
codes [16, 27, 43]. Our interest in these states, however, arises from their large tree size rather
than their error-correcting properties.

Let $\mathcal{E}$ be the following distribution over subgroups $S$. Choose an $n/2 \times n$ matrix $A$ by setting
each entry to 0 or 1 uniformly and independently. Then let $S = \{x \mid Ax \equiv 0 \pmod 2\}$. By
Theorem 4, part (i), it suffices to lower-bound the multilinear formula size of the function $f_S(x)$,
which is 1 if $x \in S$ and 0 otherwise.
Theorem 12 If $S$ is drawn from $\mathcal{E}$, then $\mathrm{MFS}(f_S) = n^{\Omega(\log n)}$ (and hence $\mathrm{TS}(|S\rangle) = n^{\Omega(\log n)}$),
with probability $\Omega(1)$ over $S$.

Proof. Let $P$ be a uniform random partition of the inputs $x_1, \ldots, x_n$ of $f_S$ into two sets
$y = (y_1, \ldots, y_{n/2})$ and $z = (z_1, \ldots, z_{n/2})$. Let $M_{S|P}$ be the $2^{n/2} \times 2^{n/2}$ matrix whose $(y, z)$ entry
is $f_{S|P}(y, z)$; then we need to show that $\mathrm{rank}(M_{S|P})$ is large with high probability. Let $A_y$ be the
$n/2 \times n/2$ submatrix of the $n/2 \times n$ matrix $A$ consisting of all columns that correspond to $y_i$ for some
$i \in \{1, \ldots, n/2\}$, and similarly let $A_z$ be the $n/2 \times n/2$ submatrix corresponding to $z$. Then it
is easy to see that, so long as $A_y$ and $A_z$ are both invertible, for all $2^{n/2}$ settings of $y$ there exists
a unique setting of $z$ for which $f_{S|P}(y, z) = 1$. This then implies that $M_{S|P}$ is a permutation
of the identity matrix, and hence that $\mathrm{rank}(M_{S|P}) = 2^{n/2}$. Now, the probability that a random
$n/2 \times n/2$ matrix over $\mathbb{Z}_2$ is invertible is

$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2^{n/2} - 1}{2^{n/2}} > 0.288.$$

So the probability that $A_y$ and $A_z$ are both invertible is at least $0.288^2$. By Markov's inequality,
it follows that for at least an 0.04 fraction of $S$'s, $\mathrm{rank}(M_{S|P}) = 2^{n/2}$ for at least an 0.04 fraction
of $P$'s. Theorem 9 then yields the desired result. ■
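The invertibility probability used above can be checked directly for tiny matrices. The Python sketch below (helper names are ours) enumerates all $m \times m$ matrices over $\mathbb{Z}_2$ for $m = 2, 3$, compares the fraction of invertible ones with the product $\prod_{i=1}^{m} (1 - 2^{-i})$, and evaluates the product for larger $m$ to confirm it stays above 0.288.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot                       # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def invertible_fraction(m):
    """Fraction of m x m matrices over F_2 that are invertible (brute force)."""
    total = good = 0
    for rows in product(range(2 ** m), repeat=m):
        total += 1
        good += gf2_rank(rows) == m
    return good / total

def product_formula(m):
    """(1/2)(3/4)...((2^m - 1)/2^m)."""
    p = 1.0
    for i in range(1, m + 1):
        p *= 1 - 2.0 ** (-i)
    return p
```

For $m = 3$ both computations give $168/512 = 0.328125$; the product decreases with $m$ toward a limit of about 0.2888.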

Aaronson and Gottesman [3] show how to prepare any $n$-qubit subgroup state using a quantum
circuit of size $O(n^2/\log n)$. So a corollary of Theorem 12 is that $\Psi\mathsf{P} \not\subset \mathsf{Tree}$. Since $f_S$ clearly has
a (non-multilinear) arithmetic formula of size $O(nk)$, a second corollary is the following.

Corollary 13 There exists a family of functions $f_n : \{0,1\}^n \to \mathbb{R}$ that has polynomial-size arithmetic
formulas, but no polynomial-size multilinear formulas.

The reason Corollary 13 does not follow from Raz's results is that polynomial-size formulas for
the permanent and determinant are not known; the smallest known formulas for the determinant
have size $n^{O(\log n)}$ (see [IS]).
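The text asserts that $f_S$ has a small non-multilinear formula without writing one down. One natural candidate, offered here as our own illustration rather than the construction the authors had in mind, is $f_S(x) = \prod_{i} \frac{1}{2}\left(1 + \prod_{j : a_{ij} = 1} (1 - 2x_j)\right)$, where $a_{ij}$ are the entries of $A$: each inner product equals $(-1)^{(Ax)_i}$, so each outer factor tests one parity constraint. With $k$ rows and $n$ columns this has $O(nk)$ leaves, and it is not multilinear because a variable $x_j$ can appear in several rows. The Python sketch below checks it against the definition on a small hypothetical matrix.

```python
from fractions import Fraction
from itertools import product

def f_S_direct(A, x):
    """1 if A x = 0 (mod 2), else 0."""
    return int(all(sum(a * b for a, b in zip(row, x)) % 2 == 0 for row in A))

def f_S_formula(A, x):
    """Product over rows of (1 + prod_{j: a_ij = 1} (1 - 2 x_j)) / 2."""
    value = Fraction(1)
    for row in A:
        sign = 1
        for a, xj in zip(row, x):
            if a:
                sign *= 1 - 2 * xj        # equals (-1)^{x_j} on bits
        value *= Fraction(1 + sign, 2)    # 1 if the row's parity is even, else 0
    return value

A = [[1, 1, 0, 0],
     [0, 1, 1, 1]]   # hypothetical 2 x 4 example matrix
agree = all(f_S_formula(A, x) == f_S_direct(A, x) for x in product((0, 1), repeat=4))
```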

We have shown that not all subgroup states are tree states, but it is still conceivable that all 
subgroup states are extremely well approximated by tree states. Let us now rule out the latter 
possibility. We first need a lemma about matrix rank, which follows from the Hoffman-Wielandt
inequality.



Lemma 14 Let $M$ be an $N \times N$ complex matrix, and let $I_N$ be the $N \times N$ identity matrix. Then
$\|M - I_N\|_2^2 \ge N - \mathrm{rank}(M)$.

Proof. The Hoffman-Wielandt inequality [28] states that for any two $N \times N$
matrices $M, P$,

$$\sum_{i=1}^{N} \left(\sigma_i(M) - \sigma_i(P)\right)^2 \le \|M - P\|_2^2,$$

where $\sigma_i(M)$ is the $i$th singular value of $M$ (that is, $\sigma_i(M) = \sqrt{\lambda_i(M)}$, where $\lambda_1(M) \ge \cdots \ge
\lambda_N(M) \ge 0$ are the eigenvalues of $MM^*$, and $M^*$ is the conjugate transpose of $M$). Clearly
$\sigma_i(I_N) = 1$ for all $i$. On the other hand, $M$ has only $\mathrm{rank}(M)$ nonzero singular values, so

$$\|M - I_N\|_2^2 \ge \sum_{i=1}^{N} \left(\sigma_i(M) - \sigma_i(I_N)\right)^2 \ge N - \mathrm{rank}(M).$$

■
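Lemma 14 can be sanity-checked on a few small real matrices with exact arithmetic. The Python sketch below is only an illustration (helper names and example matrices are ours, and it covers rational matrices, not the general complex case). The third example, the rank-1 projection with all entries $1/3$, meets the bound with equality.

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a rational matrix by Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def dist_sq_to_identity(M):
    """Squared Frobenius distance ||M - I_N||_2^2."""
    N = len(M)
    return sum((Fraction(M[i][j]) - (1 if i == j else 0)) ** 2
               for i in range(N) for j in range(N))

third = Fraction(1, 3)
examples = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 0]],   # rank 2, distance^2 = 1
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],   # rank 0, distance^2 = 3
    [[third] * 3 for _ in range(3)],     # rank 1, distance^2 = 2
]
holds = [dist_sq_to_identity(M) >= len(M) - rank(M) for M in examples]
```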



Let $\hat{f}_S(x) = f_S(x)/\sqrt{|S|}$ be $f_S(x)$ normalized to have $\|\hat{f}_S\|_2^2 = 1$.



Theorem 15 For all constants $\varepsilon \in [0, 1)$, if $S$ is drawn from $\mathcal{E}$, then $\mathrm{MFS}_\varepsilon(\hat{f}_S) = n^{\Omega(\log n)}$ with
probability $\Omega(1)$ over $S$.

Proof. As in Theorem 12, we look at the matrix $M_{\hat{f}_S|P}$ induced by a random partition $P = (y, z)$.
We already know that for at least an 0.04 fraction of $S$'s, the $y$ and $z$ variables are in one-to-one
correspondence for at least an 0.04 fraction of $P$'s. In that case $|S| = 2^{n/2}$, and therefore $M_{\hat{f}_S|P}$ is
a permutation of $I/\sqrt{|S|} = I/2^{n/4}$ where $I$ is the identity. It follows from Lemma 14 that for all
matrices $M$ such that $\|M - M_{\hat{f}_S|P}\|_2^2 \le \varepsilon$,

$$\mathrm{rank}(M) \ge 2^{n/2} - \left\|\sqrt{|S|}\left(M - M_{\hat{f}_S|P}\right)\right\|_2^2 \ge (1 - \varepsilon)\, 2^{n/2}$$

and therefore $\mathrm{rank}_\varepsilon(M_{\hat{f}_S|P}) \ge (1 - \varepsilon)\, 2^{n/2}$. Hence

$$\Pr_{P \in \mathcal{P}}\left[\mathrm{rank}_\varepsilon(M_{\hat{f}_S|P}) \ge 2^{n/2 - (n/2)^{1/8}/2}\right] \ge 0.04,$$

and the result follows from Corollary 10. ■

A corollary of Theorem 15, and of Theorem 4 part (iii), is that $\mathrm{TS}_\varepsilon(|S\rangle) = n^{\Omega(\log n)}$ with
probability $\Omega(1)$ over $S$, for all $\varepsilon < 1$.

Finally, let us show how to derandomize the lower bound for subgroup states, using ideas pointed
out to us by Andrej Bogdanov. In the proof of Theorem 12, all we used about the matrix $A$ was
that a random $k \times k$ submatrix has full rank with $\Omega(1)$ probability, where $k = n/2$. If we switch
from the field $\mathbb{F}_2$ to $\mathbb{F}_{2^d}$ for some $d \ge \log_2 n$, then it is easy to construct explicit $k \times n$ matrices
with this same property. For example, let

$$V = \begin{pmatrix} 1^0 & 1^1 & \cdots & 1^{k-1} \\ 2^0 & 2^1 & \cdots & 2^{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ n^0 & n^1 & \cdots & n^{k-1} \end{pmatrix}$$

be the $n \times k$ Vandermonde matrix, where $1, \ldots, n$ are labels of elements in $\mathbb{F}_{2^d}$. Any $k \times k$
submatrix of $V$ has full rank, because the Reed-Solomon (RS) code that $V$ represents is a perfect
erasure code.$^9$ Hence, there exists an explicit state of $n$ "qupits" with $p = 2^d$ that has tree size
$n^{\Omega(\log n)}$, namely the uniform superposition over all elements of the set $\{x \mid V^T x = 0\}$, where $V^T$
is the transpose of $V$.
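The full-rank property of Vandermonde submatrices can be verified exhaustively in a small field. The Python sketch below is an illustration with parameters of our own choosing: it works in $\mathbb{F}_8$ (so $d = 3$), represented as polynomials over $\mathbb{F}_2$ modulo the irreducible $x^3 + x + 1$, labels the rows by the seven nonzero field elements ($n = 7$), and takes $k = 3$. All helper names are hypothetical.

```python
from itertools import combinations

MOD = 0b1011  # x^3 + x + 1, irreducible over F_2; elements of F_8 are the integers 0..7

def gf_mul(a, b):
    """Multiply two elements of F_8 (carry-less product reduced modulo MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 6)   # a^7 = 1 for every nonzero a in F_8

def gf_rank(rows):
    """Rank over F_8 by Gaussian elimination (addition in F_8 is XOR)."""
    m = [list(r) for r in rows]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = gf_inv(m[rank][c])
        m[rank] = [gf_mul(inv, v) for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][c]:
                factor = m[i][c]
                m[i] = [v ^ gf_mul(factor, w) for v, w in zip(m[i], m[rank])]
        rank += 1
    return rank

n, k = 7, 3
V = [[gf_pow(a, j) for j in range(k)] for a in range(1, n + 1)]   # row a is (a^0, a^1, a^2)
all_full_rank = all(gf_rank([V[i] for i in rows]) == k
                    for rows in combinations(range(n), k))
```

All 35 choices of three rows give an invertible $3 \times 3$ matrix, as expected from the fact that a polynomial of degree at most $k - 1$ is determined by its values at any $k$ distinct points.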

To replace qupits by qubits, we concatenate the RS and Hadamard codes to obtain a binary
linear erasure code with parameters almost as good as those of the original RS code. More explicitly,
interpret $\mathbb{F}_{2^d}$ as the field of polynomials over $\mathbb{F}_2$, modulo some irreducible of degree $d$. Then let
$m(a)$ be the $d \times d$ Boolean matrix that maps $q \in \mathbb{F}_{2^d}$ to $aq \in \mathbb{F}_{2^d}$, where $q$ and $aq$ are encoded
by their $d \times 1$ vectors of coefficients. Let $H$ map a length-$d$ vector to its length-$2^d$ Hadamard
encoding. Then $Hm(a)$ is a $2^d \times d$ Boolean matrix that maps $q \in \mathbb{F}_{2^d}$ to the Hadamard encoding
of $aq$. We can now define an $n2^d \times kd$ "binary Vandermonde matrix" as follows:

$$V_{\mathrm{bin}} = \begin{pmatrix} Hm(1^0) & Hm(1^1) & \cdots & Hm(1^{k-1}) \\ Hm(2^0) & Hm(2^1) & \cdots & Hm(2^{k-1}) \\ \vdots & \vdots & \ddots & \vdots \\ Hm(n^0) & Hm(n^1) & \cdots & Hm(n^{k-1}) \end{pmatrix}.$$

For the remainder of the section, fix $k = n^\delta$ for some $\delta < 1/2$ and $d = O(\log n)$.

Lemma 16 A $(kd + c) \times kd$ submatrix of $V_{\mathrm{bin}}$ chosen uniformly at random has rank $kd$ (that is,
full rank) with probability at least 2/3, for $c$ a sufficiently large constant.

Proof. We claim that $|V_{\mathrm{bin}} u| \ge (n - k)\, 2^{d-1}$ for all nonzero vectors $u \in \mathbb{F}_2^{kd}$, where $|\cdot|$ represents
the number of '1' bits. To see this, observe that for all nonzero $u$, the "codeword vector"
$Vu \in \mathbb{F}_{2^d}^n$ must have at least $n - k$ nonzero entries by the Fundamental Theorem of Algebra, where
here $u$ is interpreted as an element of $\mathbb{F}_{2^d}^k$. Furthermore, the Hadamard code maps any nonzero
entry in $Vu$ to $2^{d-1}$ nonzero bits in $V_{\mathrm{bin}} u \in \mathbb{F}_2^{n2^d}$.

Now let $W$ be a uniformly random $(kd + c) \times kd$ submatrix of $V_{\mathrm{bin}}$. By the above claim, for
any fixed nonzero vector $u \in \mathbb{F}_2^{kd}$,

$$\Pr[Wu = 0] \le \left(1 - \frac{(n - k)\, 2^{d-1}}{n 2^d}\right)^{kd+c} = \left(\frac{1}{2} + \frac{k}{2n}\right)^{kd+c}.$$



So by the union bound, $Wu$ is nonzero for all nonzero $u$ (and hence $W$ is full rank) with probability
at least

$$1 - 2^{kd}\left(\frac{1}{2} + \frac{k}{2n}\right)^{kd+c} = 1 - \left(1 + \frac{k}{n}\right)^{kd}\left(\frac{1}{2} + \frac{k}{2n}\right)^{c}.$$

Since $k = n^{1/2 - \Omega(1)}$ and $d = O(\log n)$, the above quantity is at least 2/3 for sufficiently large $c$. ■

9 In other words, because a degree-$(k - 1)$ polynomial is determined by its values at any $k$ points.

Given an $n2^d \times 1$ Boolean vector $x$, let $f(x) = 1$ if $V_{\mathrm{bin}}^T x = 0$ and $f(x) = 0$ otherwise. Then:



Theorem 17 $\mathrm{MFS}(f) = n^{\Omega(\log n)}$.

Proof. Let $V_y$ and $V_z$ be two disjoint $kd \times (kd + c)$ submatrices of $V_{\mathrm{bin}}^T$ chosen uniformly at
random. Then by Lemma 16 together with the union bound, $V_y$ and $V_z$ both have full rank with
probability at least 1/3. Letting $\ell = kd + c$, it follows that

$$\Pr_{R \in \mathcal{R}_\ell}\left[\mathrm{rank}(M_{f|R}) \ge 2^{\ell - c}\right] \ge \frac{1}{3} = n^{-o(\log n)}$$

by the same reasoning as in Theorem 12. Therefore $\mathrm{MFS}(f) = n^{\Omega(\log n)}$ by Corollary 11. ■

Let $|S\rangle$ be a uniform superposition over all $x$ such that $f(x) = 1$; then a corollary of Theorem
17 is that $\mathrm{TS}(|S\rangle) = n^{\Omega(\log n)}$. Naturally, using the ideas of Theorem 15 one can also show that
$\mathrm{TS}_\varepsilon(|S\rangle) = n^{\Omega(\log n)}$ for all $\varepsilon < 1$.
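Two ingredients of the proof of Lemma 16 lend themselves to a quick numerical check. The Python sketch below is an illustration with parameter values of our own choosing. It confirms that the Hadamard encoding of any nonzero $d$-bit vector has exactly $2^{d-1}$ ones (here for $d = 3$), and it evaluates the union-bound estimate $1 - (1 + k/n)^{kd} (1/2 + k/(2n))^c$ for one sample setting with $k$ well below $\sqrt{n}$.

```python
from itertools import product

def hadamard_encode(v):
    """Hadamard encoding of a bit vector v: the inner products <v, w> mod 2 over all w."""
    d = len(v)
    return [sum(a * b for a, b in zip(v, w)) % 2 for w in product((0, 1), repeat=d)]

def full_rank_lower_bound(n, k, d, c):
    """The estimate 1 - (1 + k/n)^(k d) * (1/2 + k/(2n))^c for W having full rank."""
    return 1 - (1 + k / n) ** (k * d) * (0.5 + k / (2 * n)) ** c

weights = [sum(hadamard_encode(v)) for v in product((0, 1), repeat=3) if any(v)]
sample = full_rank_lower_bound(n=2 ** 20, k=2 ** 5, d=20, c=3)
```

In the sample setting the factor $(1 + k/n)^{kd}$ is close to 1 because $k^2 d / n$ is small, so a constant $c$ already pushes the estimate above 2/3.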

5.2 Shor States 

Since the motivation for our theory was to study possible Sure/Shor separators, an obvious question 
is, do states arising in Shor's algorithm have superpolynomial tree size? Unfortunately, we are only 
able to answer this question assuming a number-theoretic conjecture. To formalize the question,
let

$$\frac{1}{2^{n/2}} \sum_{r=0}^{2^n - 1} |r\rangle\, |x^r \bmod N\rangle$$

be a Shor state. It will be convenient for us to measure the second register, so that the state of
the first register has the form

$$|a + p\mathbb{Z}\rangle = \frac{1}{\sqrt{I}} \sum_{i=0}^{I} |a + pi\rangle$$

for some integers $a < p$ and $I = \lfloor (2^n - a - 1)/p \rfloor$. Here $a + pi$ is written out in binary using $n$
bits. Clearly a lower bound on $\mathrm{TS}(|a + p\mathbb{Z}\rangle)$ would imply an equivalent lower bound for the joint
state of the two registers. Also, to avoid some technicalities we assume $p$ is prime. Since our goal
is to prove a lower bound, this assumption is without loss of generality.

Given an $n$-bit string $x = x_{n-1} \ldots x_0$, let $f_{n,p,a}(x) = 1$ if $x \equiv a \pmod p$ and $f_{n,p,a}(x) = 0$
otherwise. Then $\mathrm{TS}(|a + p\mathbb{Z}\rangle) = \Theta(\mathrm{MFS}(f_{n,p,a}))$ by Theorem 4, so from now on we will focus
attention on $f_{n,p,a}$.

Proposition 18 

(i) Let $f_{n,p} = f_{n,p,0}$. Then $\mathrm{MFS}(f_{n,p,a}) \le \mathrm{MFS}(f_{n+\log p,\,p})$, meaning that we can set $a = 0$
without loss of generality.

(ii) $\mathrm{MFS}(f_{n,p}) = O(\min\{n 2^n/p,\ np\})$.
Proof. 

(i) Take the formula for $f_{n+\log p,\,p}$, and restrict the most significant $\log p$ bits to sum to a number
congruent to $-a \bmod p$ (this is always possible since $x \to 2^n x$ is an isomorphism of $\mathbb{Z}_p$).

(ii) For $\mathrm{MFS}(f_{n,p}) = O(n 2^n/p)$, write out the $x$'s for which $f_{n,p}(x) = 1$ explicitly. For
$\mathrm{MFS}(f_{n,p}) = O(np)$, use the Fourier transform, similarly to Theorem 26 part (v):

$$f_{n,p}(x) = \frac{1}{p} \sum_{h=0}^{p-1} \prod_{j=0}^{n-1} \exp\left(\frac{2\pi i h}{p} \cdot 2^j x_j\right).$$

This immediately yields a sum-of-products formula of size $O(np)$.

■
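The Fourier expression in part (ii) can be checked numerically for small parameters. The Python sketch below (function names are ours; bits are stored least significant first) compares the sum of $p$ products of $n$ single-variable factors against the direct test of whether the integer encoded by $x$ is divisible by $p$.

```python
import cmath
from itertools import product

def f_fourier(x, p):
    """(1/p) * sum_h prod_pos exp(2*pi*i*h * 2^pos * x_pos / p); x is LSB-first."""
    total = 0
    for h in range(p):
        term = 1
        for pos, bit in enumerate(x):
            term *= cmath.exp(2j * cmath.pi * h * (2 ** pos) * bit / p)
        total += term
    return total / p

def f_direct(x, p):
    """1 if the integer with binary digits x (LSB-first) is divisible by p, else 0."""
    value = sum(bit << pos for pos, bit in enumerate(x))
    return int(value % p == 0)

max_error = max(abs(f_fourier(x, 5) - f_direct(x, 5)) for x in product((0, 1), repeat=5))
```

Each factor depends on a single variable $x_j$, so every product is multilinear, and there are $p$ products of $n$ factors each, matching the $O(np)$ size bound.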

We now state our number-theoretic conjecture. 
Conjecture 19 There exist constants $\gamma, \delta \in (0, 1)$ and a prime $p = \Omega(2^{n^\delta})$ for which the following
holds. Let the set $A$ consist of $n^\delta$ elements of $\{2^0, \ldots, 2^{n-1}\}$ chosen uniformly at random. Let $S$
consist of all $2^{n^\delta}$ sums of subsets of $A$, and let $S \bmod p = \{x \bmod p : x \in S\}$. Then

$$\Pr_A\left[|S \bmod p| \ge (1 + \gamma)\frac{p}{2}\right] = n^{-o(\log n)}.$$
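The objects in the conjecture are simple to compute for toy parameters. The Python sketch below is only an experimental aid with hypothetical helper names; small examples say nothing about the asymptotic statement. It computes $S \bmod p$ incrementally and samples a random set $A$ of distinct powers of two.

```python
import random

def subset_sum_residues(A, p):
    """Residues mod p of all subset sums of A (including the empty sum)."""
    residues = {0}
    for a in A:
        residues |= {(r + a) % p for r in residues}
    return residues

def sample_A(n, size, rng):
    """Choose `size` distinct elements of {2^0, ..., 2^(n-1)} uniformly at random."""
    return [2 ** e for e in rng.sample(range(n), size)]

covered = subset_sum_residues([1, 2, 4], 7)    # subset sums 0..7 hit every residue mod 7
A = sample_A(16, 4, random.Random(0))
fraction = len(subset_sum_residues(A, 13)) / 13
```

Working with residues rather than with the $2^{|A|}$ sums themselves keeps the set at size at most $p$ at every step.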



Theorem 20 Conjecture 19 implies that $\mathrm{MFS}(f_{n,p}) = n^{\Omega(\log n)}$ and hence $\mathrm{TS}(|p\mathbb{Z}\rangle) = n^{\Omega(\log n)}$.

Proof. Let $f = f_{n,p}$ and $\ell = n^\delta$. Let $R$ be a restriction of $f$ that renames $2\ell$ variables
$y_1, \ldots, y_\ell, z_1, \ldots, z_\ell$, and sets each of the remaining $n - 2\ell$ variables to 0 or 1. This leads to a new
function, $f_R(y, z)$, which is 1 if $y + z + c \equiv 0 \pmod p$ and 0 otherwise for some constant $c$. Here we are
defining $y = 2^{a_1} y_1 + \cdots + 2^{a_\ell} y_\ell$ and $z = 2^{b_1} z_1 + \cdots + 2^{b_\ell} z_\ell$ where $a_1, \ldots, a_\ell, b_1, \ldots, b_\ell$ are the appropriate
place values. Now suppose $y \bmod p$ and $z \bmod p$ both assume at least $(1 + \gamma)p/2$ distinct values as
we range over all $x \in \{0,1\}^n$. Then by the pigeonhole principle, for at least $\gamma p$ possible values of
$y \bmod p$, there exists a unique possible value of $z \bmod p$ for which $y + z + c \equiv 0 \pmod p$ and hence
$f_R(y, z) = 1$. So $\mathrm{rank}(M_{f|R}) \ge \gamma p$, where $M_{f|R}$ is the $2^\ell \times 2^\ell$ matrix whose $(y, z)$ entry is $f_R(y, z)$.
It follows that assuming Conjecture 19,

$$\Pr_{R \in \mathcal{R}_\ell}\left[\mathrm{rank}(M_{f|R}) \ge \gamma p\right] = n^{-o(\log n)}.$$

Furthermore, $\gamma p \ge 2^{\ell - \ell^{1/8}/2}$ for sufficiently large $n$ since $p = \Omega(2^{n^\delta})$. Therefore $\mathrm{MFS}(f) =
n^{\Omega(\log n)}$ by Corollary 11. ■

Using the ideas of Theorem 15, one can show that under the same conjecture, $\mathrm{MFS}_\varepsilon(f_{n,p}) =
n^{\Omega(\log n)}$ and $\mathrm{TS}_\varepsilon(|p\mathbb{Z}\rangle) = n^{\Omega(\log n)}$ for all $\varepsilon < 1$; in other words, there exist Shor states that cannot
be approximated by polynomial-size trees.

In an earlier version of this paper, Conjecture 19 was stated without any restriction on how the
set S is formed. The resulting conjecture was far more general than we needed, and indeed was 
falsified by Carl Pomerance (personal communication). 

5.3 Error Correction, Tree Size, and Persistence of Entanglement 

In this section we pursue a deeper understanding of our lower bounds. Recall the states for which 
we were most successful in proving lower bounds are exactly the states that arise in quantum error 
correction. Is this just a coincidence, or should it have been expected? Also, can Raz's technique 
be given any physical interpretation? 

Let $|S\rangle$ be a uniform superposition over the elements of some subset $S \subseteq \{0,1\}^n$. Then our
first observation is that if the elements of $S$ are codewords of a sufficiently good erasure code, then
Corollary 11 yields an $n^{\Omega(\log n)}$ tree size lower bound for $|S\rangle$.

Theorem 21 Let $\ell = n^\delta$ for some constant $\delta > 0$, and let $\ell \le L \le n/(4\ell^{7/8})$. Suppose that $|S| = 2^{n-L}$
(that is, $n - L$ bits are being encoded); and that for each $x \in S$, if we are given $n - \ell$ bits of $x$ drawn
uniformly at random together with their locations, then with probability $1 - o(1)$ we can recover $x$
itself. Then $\mathrm{TS}(|S\rangle) = n^{\Omega(\log n)}$.

Proof. Let $f(x) = 1$ if $x \in S$ and 0 otherwise. Then it suffices to show that

$$\Pr_{R \in \mathcal{R}_\ell}\left[\mathrm{rank}(M_{f|R}) \ge 2^{\ell - \ell^{1/8}/2}\right] = n^{-o(\log n)}.$$

Clearly an $x \in S$ drawn uniformly at random has entropy $n - L$. So if $i_1, \ldots, i_{n-\ell} \in \{1, \ldots, n\}$ are
drawn uniformly at random without replacement, then the subsequence $x_{i_1}, \ldots, x_{i_{n-2\ell}}$ has expected
entropy $\frac{n - 2\ell}{n}(n - L)$, and the subsequence $x_{i_1}, \ldots, x_{i_{n-\ell}}$ has expected entropy $\frac{n - \ell}{n}(n - L)$. By
Markov's inequality, therefore, the entropy of $x_{i_{n-2\ell+1}}, \ldots, x_{i_{n-\ell}}$ conditioned on $x_{i_1}, \ldots, x_{i_{n-2\ell}}$ is at
least $\ell - 2\ell L/n$ with probability at least 1/2 (since the entropy can never be greater than $\ell$). It follows
that with probability at least 1/2 over the restriction $R \in \mathcal{R}_\ell$, there are at least $2^{\ell - 2\ell L/n} \ge 2^{\ell - \ell^{1/8}/2}$
distinct settings of $y \in \{0,1\}^\ell$ for which $f_R(y, z) = 1$. Here we have used the fact that $L \le n/(4\ell^{7/8})$.
But this then implies that $\mathrm{rank}(M_{f|R}) \ge 2^{\ell - \ell^{1/8}/2}$ with probability $1 - o(1)$. For given $y$, if there
are two or more values of $z$ for which $f_R(y, z) = 1$, then $x$ is not uniquely recoverable from the
$n - \ell$ bits outside of $z$. ■

The converse of Theorem 21 is false. For choose $S \subseteq \{0,1\}^n$ uniformly at random subject
to $|S| = 2^{n-1}$. Then Corollary 11 yields an $n^{\Omega(\log n)}$ lower bound on $\mathrm{TS}(|S\rangle)$, but $S$ does not
correspond to any good error-correcting code. So roughly speaking, if $|S\rangle$ is a codeword state then
$|S\rangle$ has large tree size, but not vice versa.

We can gain further insight by asking what physical properties a codeword state has to have. 
One important property is "persistence of entanglement," introduced by Dür and Briegel [18] among
others. This is the property of remaining highly entangled even after a limited amount of interaction
with the environment. For example, the Schrödinger cat state $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ is in some sense
highly entangled, but it is not persistently entangled, since measuring a single qubit in the standard
basis destroys all entanglement.

By contrast, consider the "cluster states" defined by Briegel and Raussendorf [14]. These
states have attracted a great deal of attention because of their application to quantum computing 
via 1-qubit measurements only jlU]. For our purposes, a two-dimensional cluster state is an equal 
superposition over all settings of a $\sqrt{n} \times \sqrt{n}$ array of bits, with each basis state having a phase of
$(-1)^r$, where $r$ is the number of horizontally or vertically adjacent pairs of bits that are both '1'.
Dür and Briegel [18] showed that such states are persistently entangled in a precise sense: one can
distill n-partite entanglement from them even after each qubit has interacted with a heat bath for 
an amount of time independent of n. 
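
As a concrete illustration of the definition above, the following Python sketch computes the sign $(-1)^r$ attached to each basis state of a small two-dimensional cluster state. The function names (`cluster_phase`, `cluster_signs`) and the row-major bit ordering are assumptions made for this example; the true amplitude of each basis state is the returned sign divided by $2^{n/2}$.

```python
from itertools import product


def cluster_phase(grid):
    """Return (-1)**r, where r counts horizontally or vertically adjacent
    pairs of bits in `grid` (a list of rows) that are both 1."""
    rows, cols = len(grid), len(grid[0])
    r = 0
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 1:
                if j + 1 < cols and grid[i][j + 1] == 1:
                    r += 1  # horizontal neighbour to the right
                if i + 1 < rows and grid[i + 1][j] == 1:
                    r += 1  # vertical neighbour below
    return -1 if r % 2 else 1


def cluster_signs(side):
    """Map every bit string of length side*side (read row by row) to its
    sign. The amplitude is sign / 2**(side*side/2)."""
    signs = {}
    for bits in product((0, 1), repeat=side * side):
        grid = [list(bits[k * side:(k + 1) * side]) for k in range(side)]
        signs[bits] = cluster_phase(grid)
    return signs
```

The enumeration is exponential in the number of qubits, so it is only useful for checking small cases by hand.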

Persistence of entanglement seems related to how one shows tree size lower bounds using Raz's 
technique. For to apply Corollary ^2 one basically "measures" most of a state's qubits, then 
partitions the unmeasured qubits into two subsystems of equal size, and argues that with high 
probability those two subsystems are still almost maximally entangled. The connection is not 
perfect, though. For one thing, setting most of the qubits to 0 or 1 uniformly at random is not the
same as measuring them. For another, Theorem |S] yields $n^{\Omega(\log n)}$ tree size lower bounds without
the need to trace out a subset of qubits. It suffices for the original state to be almost maximally 
entangled, no matter how one partitions it into two subsystems of equal size. 

But what about 2-D cluster states: do they have tree size $n^{\Omega(\log n)}$? We strongly conjecture
that the answer is 'yes.' However, proving this conjecture will almost certainly require going
beyond Theorem [5J. One will want to use random restrictions that respect the 2-D neighborhood
structure of cluster states, similar to the restrictions used by Raz [41] to show that permanent and
determinant have multilinear formula size $n^{\Omega(\log n)}$.

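We end this section by showing that there exist states that are persistently entangled in the
sense of Dür and Briegel [18], but that have polynomial tree size. In particular, Dür and Briegel
showed that even one-dimensional cluster states are persistently entangled. On the other hand:

Proposition 22 Let

$$|\psi\rangle = \frac{1}{2^{n/2}} \sum_{x \in \{0,1\}^n} (-1)^{x_1 x_2 + x_2 x_3 + \cdots + x_{n-1} x_n} |x\rangle.$$

Then $\mathrm{TS}(|\psi\rangle) = O(n^4)$.

Proof. Given bits $i, j, k$, let $|P_n^{ijk}\rangle$ be an equal superposition over all $n$-bit strings $x_1 \ldots x_n$
such that $x_1 = i$, $x_n = k$, and $x_1 x_2 + \cdots + x_{n-1} x_n \equiv j \pmod 2$. Then, splitting each string at its
midpoint, $|P_n^{ijk}\rangle$ is a linear combination of the eight states

$$|P_{n/2}^{ica}\rangle \otimes |P_{n/2}^{b,\, j \oplus c \oplus ab,\, k}\rangle, \qquad a, b, c \in \{0,1\}.$$

Therefore $\mathrm{TS}(|P_n^{ijk}\rangle) \le 16\, \mathrm{TS}(|P_{n/2}^{ijk}\rangle)$, and solving this recurrence relation yields
$\mathrm{TS}(|P_n^{ijk}\rangle) = O(n^4)$. Finally observe that

$$|\psi\rangle = \left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right)^{\otimes n} - 2^{1 - n/2} \sum_{x \,:\, x_1 x_2 + \cdots + x_{n-1} x_n \equiv 1 \ (\mathrm{mod}\ 2)} |x\rangle,$$

where the sum is a linear combination of the four states $|P_n^{i1k}\rangle$.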
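
The divide-and-conquer idea behind Proposition 22 can be checked by brute force for small $n$. The Python sketch below assumes the split is taken at the midpoint of the string, summing over the two middle bits $a, b$ and the parity $c$ of the left half; that gives eight terms, each a tensor product of two half-length states, which is one way to arrive at a factor of 16 per halving. The function names and the base case $T(1) = 1$ are illustrative assumptions.

```python
from itertools import product


def parity_class(n, i, j, k):
    """All n-bit tuples x with x[0] == i, x[-1] == k and
    x[0]x[1] + ... + x[n-2]x[n-1] == j (mod 2)."""
    out = set()
    for x in product((0, 1), repeat=n):
        if x[0] == i and x[-1] == k:
            if sum(x[t] * x[t + 1] for t in range(n - 1)) % 2 == j:
                out.add(x)
    return out


def split_class(n, i, j, k):
    """Rebuild the same set from two halves: one piece for each choice of
    middle bits a, b and left-half parity c (eight pieces in all)."""
    h = n // 2
    pieces = []
    for a, b, c in product((0, 1), repeat=3):
        left = parity_class(h, i, c, a)
        # The pair (a, b) straddling the cut contributes a*b to the parity.
        right = parity_class(n - h, b, (j + c + a * b) % 2, k)
        pieces.append({u + v for u in left for v in right})
    return pieces


def leaf_bound(n):
    """Leaf count from T(n) = 16 T(n/2), T(1) = 1, for n a power of two."""
    return 1 if n == 1 else 16 * leaf_bound(n // 2)
```

For $n$ a power of two, `leaf_bound(n)` equals $n^4$, matching the stated $O(n^4)$ bound.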



6 Computing With Tree States 

Suppose a quantum computer is restricted to being in a tree state at all times. (We can imagine that 
if the tree size ever exceeds some polynomial bound, the quantum computer explodes, destroying 
our laboratory.) Does the computer then have an efficient classical simulation? In other words, 
letting TreeBQP be the class of languages accepted by such a machine, does TreeBQP = BPP? A 
positive answer would make tree states more attractive as a Sure/Shor separator. For once we 
admit any states incompatible with the polynomial-time Church-Turing thesis, it seems like we
might as well go all the way, and admit all states preparable by polynomial-size quantum circuits!
The TreeBQP versus BPP problem is closely related to the problem of finding an efficient (classical)
algorithm to learn multilinear formulas. In light of Raz's lower bound, and of the connection
between lower bounds and learning noticed by Linial, Mansour, and Nisan [36], the latter problem
might be less hopeless than it looks. In this section we show a weaker result: that TreeBQP is
contained in $\Sigma_3^{\mathsf{P}} \cap \Pi_3^{\mathsf{P}}$, the third level of the polynomial hierarchy. Since BQP is not known to lie
in PH, this result could be taken as weak evidence that TreeBQP $\ne$ BQP. (On the other hand, we
do not yet have oracle evidence even for BQP $\not\subset$ AM, though not for lack of trying [2].)

Definition 23 TreeBQP is the class of languages accepted by a BQP machine subject to the
constraint that at every time step $t$, the machine's state $|\psi^{(t)}\rangle$ is exponentially close to a tree state.
More formally, the initial state is $|\psi^{(0)}\rangle = |0\rangle^{\otimes (p(n) - n)} \otimes |x\rangle$ (for an input $x \in \{0,1\}^n$ and
polynomial bound $p$), and a uniform classical polynomial-time algorithm generates a sequence of gates
$g^{(1)}, \ldots, g^{(p(n))}$. Each $g^{(t)}$ can either be selected from some finite universal basis of unitary
gates (as we will show in Theorem 24, part (i), the choice of gate set will not matter), or can be
a 1-qubit measurement. When we perform a measurement, the state evolves to one of two
possible pure states, with the usual probabilities, rather than to a mixed state. We require that the
final gate $g^{(p(n))}$ is a measurement of the first qubit. If at least one intermediate state $|\psi^{(t)}\rangle$ had
$\mathrm{TS}_{1/2^{\Omega(n)}}(|\psi^{(t)}\rangle) > p(n)$, then the outcome of the final measurement is chosen adversarially;
otherwise it is given by the usual Born probabilities. The measurement must return 1 with probability
at least 2/3 if the input is in the language, and with probability at most 1/3 otherwise.

Some comments on the definition: we allow $|\psi^{(t)}\rangle$ to deviate from a tree state by an
exponentially small amount, in order to make the model independent of the choice of gate set. We allow
intermediate measurements because otherwise it is unclear even how to simulate BPP.$^{10}$ The rule
for measurements follows the "Copenhagen interpretation," in the sense that if a qubit is measured
to be 1, then subsequent computation is not affected by what would have happened were the qubit
measured to be 0. In particular, if measuring 0 would have led to states of tree size greater than
$p(n)$, that does not invalidate the results of the path where 1 is measured.

The following theorem shows that TreeBQP has many of the properties we would want it to 
have. 

Theorem 24 

(i) The definition of TreeBQP is invariant under the choice of gate set.

(ii) The probabilities $(1/3, 2/3)$ can be replaced by any $(p, 1-p)$ with $2^{-2^{\sqrt{\log n}}} \le p < 1/2$.

(iii) BPP $\subseteq$ TreeBQP $\subseteq$ BQP.

Proof. 

(i) The Solovay-Kitaev Theorem [32, 37] shows that given a universal gate set, we can
approximate any $k$-qubit unitary to accuracy $\varepsilon$ using $k$ qubits and a circuit of size $O(\mathrm{polylog}(1/\varepsilon))$.
So let $|\psi^{(0)}\rangle, \ldots, |\psi^{(p(n))}\rangle \in \mathcal{H}_2^{\otimes p(n)}$ be a sequence of states, with $|\psi^{(t)}\rangle$ produced from $|\psi^{(t-1)}\rangle$
by applying a $k$-qubit unitary $g^{(t)}$ (where $k = O(1)$). Then using a polynomial-size circuit,
we can approximate each $|\psi^{(t)}\rangle$ to accuracy $1/2^{\Omega(n)}$, as in the definition of TreeBQP.
Furthermore, since the approximation circuit for $g^{(t)}$ acts only on $k$ qubits, any intermediate
state $|\varphi\rangle$ it produces satisfies $\mathrm{TS}_{1/2^{\Omega(n)}}(|\varphi\rangle) \le k 4^k\, \mathrm{TS}_{1/2^{\Omega(n)}}(|\psi^{(t-1)}\rangle)$ by Proposition |2J.

(ii) To amplify to a constant probability, run $k$ copies of the computation in tensor product, then
output the majority answer. By part (i), outputting the majority can increase the tree size
by a factor of at most $2^{k+1}$. To amplify to $2^{-2^{\sqrt{\log n}}}$, observe that the Boolean majority
function on $k$ bits has a multilinear formula of size $k^{O(\log k)}$. For let $T_k^h(x_1, \ldots, x_k)$ equal 1
if $x_1 + \cdots + x_k \ge h$ and 0 otherwise; then

$$T_k^h(x_1, \ldots, x_k) = 1 - \prod_{i=0}^{h} \left(1 - T_{\lfloor k/2 \rfloor}^{i}\left(x_1, \ldots, x_{\lfloor k/2 \rfloor}\right) T_{\lceil k/2 \rceil}^{h-i}\left(x_{\lfloor k/2 \rfloor + 1}, \ldots, x_k\right)\right),$$

so $\mathrm{MFS}(T_k^h) \le 2h \max_i \mathrm{MFS}(T_{\lceil k/2 \rceil}^{i}) + O(1)$, and solving this recurrence yields
$\mathrm{MFS}(T_k^{k/2}) = k^{O(\log k)}$. Substituting $k = 2^{\sqrt{\log n}}$ into $k^{O(\log k)}$ yields $n^{O(1)}$, meaning the tree size increases
by at most a polynomial factor.

(iii) To simulate BPP, we just perform a classical reversible computation, applying a Hadamard
followed by a measurement to some qubit whenever we need a random bit. Since the number
of basis states with nonzero amplitude is at most 2, the simulation is clearly in TreeBQP.
The other containment is obvious.

10 If we try to simulate BPP in the standard way, we might produce complicated entanglement between the
computation register and the register containing the random bits, and no longer have a tree state.
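
The halving step for threshold functions in part (ii) can be sanity-checked on small inputs. The Python sketch below assumes the split is an OR, over $i$, of "at least $i$ ones in the left half" AND "at least $h - i$ ones in the right half", written in the $1 - \prod(1 - \cdot)$ form; the helper names and the crude size recurrence $S(k) = 2k\, S(k/2)$ (taking $h \le k$) are illustrative assumptions rather than the exact bookkeeping of the proof.

```python
from itertools import product


def threshold(bits, h):
    """T^h: 1 if at least h of the bits are 1, else 0."""
    return 1 if sum(bits) >= h else 0


def threshold_split(bits, h):
    """Same function computed from the two halves of `bits`."""
    mid = len(bits) // 2
    left, right = bits[:mid], bits[mid:]
    prod = 1
    for i in range(h + 1):
        prod *= 1 - threshold(left, i) * threshold(right, h - i)
    return 1 - prod


def check_split(k):
    """Brute-force comparison of the two definitions on all k-bit inputs."""
    return all(
        threshold_split(bits, h) == threshold(bits, h)
        for bits in product((0, 1), repeat=k)
        for h in range(k + 1)
    )


def size_bound(k):
    """Leaf count from S(k) = 2k * S(k/2), S(1) = 1 (k a power of two)."""
    return 1 if k <= 1 else 2 * k * size_bound(k // 2)
```

Unrolling `size_bound` gives at most $(2k)^{\log_2 k} = k^{O(\log k)}$, the quasi-polynomial growth used in the amplification argument.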



Theorem 25 TreeBQP $\subseteq \Sigma_3^{\mathsf{P}} \cap \Pi_3^{\mathsf{P}}$.

Proof. Since TreeBQP is closed under complement, it suffices to show that TreeBQP $\subseteq \Pi_3^{\mathsf{P}}$. Our
proof will combine approximate counting with a predicate to verify the correctness of a TreeBQP
computation. Let $C$ be a uniformly-generated quantum circuit, and let $M = (m^{(1)}, \ldots, m^{(p(n))})$
be a sequence of binary measurement outcomes. We adopt the convention that after making a
measurement, the state vector is not rescaled to have norm 1. That way the probabilities across
all 'measurement branches' continue to sum to 1. Let $|\psi_{M,x}^{(0)}\rangle, \ldots, |\psi_{M,x}^{(p(n))}\rangle$ be the sequence of
unnormalized pure states under measurement outcome sequence $M$ and input $x$, where
$|\psi_{M,x}^{(t)}\rangle = \sum_{y \in \{0,1\}^{p(n)}} \alpha_{y,M,x}^{(t)} |y\rangle$. Also, let $\Lambda(M, x)$ express that
$\mathrm{TS}_{1/2^{\Omega(n)}}(|\psi_{M,x}^{(t)}\rangle) \le p(n)$ for every $t$. Then $C$ accepts if

$$W_x = \sum_{M \,:\, \Lambda(M,x)} \; \sum_{y \in \{0,1\}^{p(n)-1}} \left|\alpha_{1y,M,x}^{(p(n))}\right|^2 \ge \frac{2}{3},$$

while $C$ rejects if $W_x \le 1/3$. If we could compute each $|\alpha_{1y,M,x}^{(p(n))}|$ efficiently (as well as $\Lambda(M, x)$),
we would then have a $\Pi_2^{\mathsf{P}}$ predicate expressing that $W_x \ge 2/3$. This follows since we can do
approximate counting via hashing in AM $\subseteq \Pi_2^{\mathsf{P}}$ [26], and thereby verify that an exponentially large
sum of nonnegative terms is at least 2/3, rather than at most 1/3. The one further fact we need
is that in our $\Pi_2^{\mathsf{P}}$ ($\forall\exists$) predicate, we can take the existential quantifier to range over tuples of
'candidate solutions': that is, $(M, y)$ pairs together with lower bounds $\beta$ on $|\alpha_{1y,M,x}^{(p(n))}|$.

It remains only to show how we verify that $\Lambda(M, x)$ holds and that $|\alpha_{1y,M,x}^{(p(n))}| = \beta$. First, we
extend the existential quantifier so that it guesses not only $M$ and $y$, but also a sequence of trees
$T^{(0)}, \ldots, T^{(p(n))}$, representing $|\psi_{M,x}^{(0)}\rangle, \ldots, |\psi_{M,x}^{(p(n))}\rangle$ respectively. Second, using the last universal
quantifier to range over $\hat{y} \in \{0,1\}^{p(n)}$, we verify the following:

(1) $T^{(0)}$ is a fixed tree representing $|0\rangle^{\otimes (p(n) - n)} \otimes |x\rangle$.

(2) $|\alpha_{1y,M,x}^{(p(n))}|$ equals its claimed value to $\Omega(n)$ bits of precision.

(3) Let $g^{(1)}, \ldots, g^{(p(n))}$ be the gates applied by $C$. Then for all $t$ and $\hat{y}$, if $g^{(t)}$ is unitary then
$\alpha_{\hat{y},M,x}^{(t)} = \langle \hat{y} | g^{(t)} | \psi_{M,x}^{(t-1)} \rangle$ to $\Omega(n)$ bits of precision. Here the right-hand side is a sum of $2^k$
terms ($k$ being the number of qubits acted on by $g^{(t)}$), each term efficiently computable given
$T^{(t-1)}$. Similarly, if $g^{(t)}$ is a measurement of the $i$-th qubit, then $\alpha_{\hat{y},M,x}^{(t)} = \alpha_{\hat{y},M,x}^{(t-1)}$ if the $i$-th bit
of $\hat{y}$ equals $m^{(t)}$, while $\alpha_{\hat{y},M,x}^{(t)} = 0$ otherwise.
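
The local consistency check in step (3) can be made concrete. The Python sketch below assumes amplitudes are available through a callable (standing in for evaluating the guessed tree), and computes a single new amplitude as a sum of $2^k$ terms for a $k$-qubit gate, or by the copy-or-zero rule for a measurement. The function names, the big-endian ordering of the target bits, and the representation of gates as nested lists are all assumptions of this illustration.

```python
def amplitude_after_gate(old_amp, gate, qubits, y):
    """Amplitude of basis string y after a k-qubit gate.

    old_amp : callable mapping a bit tuple to its amplitude before the gate
    gate    : 2^k x 2^k matrix as a list of rows; row/column index is the
              k target bits read as a binary number (first target = high bit)
    qubits  : positions of the k target qubits inside y
    """
    k = len(qubits)
    row = 0
    for q in qubits:
        row = 2 * row + y[q]
    total = 0
    for col in range(2 ** k):  # the 2^k terms of the sum
        z = list(y)
        for pos, q in enumerate(qubits):
            z[q] = (col >> (k - 1 - pos)) & 1
        total += gate[row][col] * old_amp(tuple(z))
    return total


def amplitude_after_measurement(old_amp, i, outcome, y):
    """Unnormalized rule: keep the amplitude if bit i of y matches the
    recorded outcome, otherwise set it to zero."""
    return old_amp(y) if y[i] == outcome else 0
```

Each call touches only $2^k$ old amplitudes, which is why the check is efficient whenever individual amplitudes are.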



In the proof of Theorem 25, the only fact about tree states we use is that Tree $\subseteq$ AmpP; that
is, there is a polynomial-time classical algorithm that computes the amplitude $\alpha_x$ of any basis
state $|x\rangle$. So if we define AmpP-BQP analogously to TreeBQP except that any states in AmpP are
allowed, then AmpP-BQP $\subseteq \Sigma_3^{\mathsf{P}} \cap \Pi_3^{\mathsf{P}}$ as well.



7 The Experimental Situation 

The results of this paper suggest an obvious challenge for experimenters: prepare non-tree states 
in the lab. For were this challenge met, it would rule out one way in which quantum mechanics 
could fail, just as the Bell inequality experiments of Aspect et al. jH] did twenty years ago. If they 
wished, quantum computing skeptics could then propose a new candidate Sure/Shor separator, and 
experimenters could try to rule out that one, and so on. The result would be to divide the question 
of whether quantum computing is possible into a series of smaller questions about which states can 
be prepared. In our view, this would aid progress in two ways: by helping experimenters set clear 
goals, and by forcing theorists to state clear positions. 

However, our experimental challenge raises some immediate questions. In particular, what 
would it mean to prepare a non-tree state? How would we know if we succeeded? Also, have 
non-tree states already been prepared (or observed)? The purpose of this section is to set out our 
thoughts about these questions. 

First of all, when discussing experiments, it goes without saying that we must convert asymp- 
totic statements into statements about specific values of n. The central tenet of computational 
complexity theory is that this is possible. Thus, instead of asking whether n-qubit states with 
tree size $2^{\Omega(n)}$ can be prepared, we ask whether 200-qubit states with tree size at least (say) $2^{80}$
can be prepared. Even though the second question does not logically imply anything about the
first, the second is closer to what we ultimately care about anyway. Admittedly, knowing that
$\mathrm{TS}(|\psi_n\rangle) = n^{\Omega(\log n)}$ tells us little about $\mathrm{TS}(|\psi_{100}\rangle)$ or $\mathrm{TS}(|\psi_{200}\rangle)$, especially since in Raz's paper
[41], the constant in the exponent $\Omega(\log n)$ is taken to be $10^{-6}$ (though this can certainly be
improved). Thus, proving tight lower bounds for small $n$ is one of the most important problems left
open by this paper. In Appendix 10 we solve the problem for the case of manifestly orthogonal
tree size.

A second common objection is that our formalism applies only to pure states, but in reality 
all states are mixed. However, there are several natural ways to extend the formalism to mixed 
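states. Given a mixed state $\rho$, we could minimize tree size over all purifications of $\rho$, or minimize
the expected tree size $\sum_i |\alpha_i|^2\, \mathrm{TS}(|\psi_i\rangle)$, or maximum $\max_i \mathrm{TS}(|\psi_i\rangle)$, over all decompositions
$\rho = \sum_i |\alpha_i|^2\, |\psi_i\rangle\langle\psi_i|$.
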
A third objection is that a real quantum state might be a "soup" of free-wandering fermions and
bosons, with no localized subsystems corresponding to qubits. How can one determine the tree size
of such a state? The answer is that one cannot. Any complexity measure for particle position and
momentum states would have to be quite different from the measures considered in this paper. On
the other hand, the states of interest for quantum computing usually do involve localized qubits.
Indeed, even if quantum information is stored in particle positions, one might force each particle
into two sites (corresponding to $|0\rangle$ and $|1\rangle$), neither of which can be occupied by any other particle.
In that case it again becomes meaningful to discuss tree size.

But how do we verify that a state with large tree size was prepared? Of course, if $|\psi\rangle$ is
preparable by a polynomial-size quantum circuit, then assuming quantum mechanics is valid (and
assuming our gates behave as specified), we can always test whether a given state $|\varphi\rangle$ is close to
$|\psi\rangle$ or not. Let $U$ map $|0\rangle^{\otimes n}$ to $|\psi\rangle$; then it suffices to test whether $U^{-1}|\varphi\rangle$ is close to $|0\rangle^{\otimes n}$.
However, in the experiments under discussion, the validity of quantum mechanics is the very point 
in question. And once we allow Nature to behave in arbitrary ways, a skeptic could explain any 
experimental result without having to invoke states with large tree size. 

The above fact has often been urged against us, but as it stands, it is no different from the fact 
that one could explain any astronomical observation without abandoning the Ptolemaic system. 
The issue is not one of mathematical proof, but of accumulating observations that are consistent 
with the hypothesis of large tree size, and inconsistent with alternative hypotheses if we disallow 
special pleading. So for example, to test whether the subgroup state 

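$$|S\rangle = \frac{1}{\sqrt{|S|}} \sum_{x \in S} |x\rangle$$

was prepared, we might use CNOT gates to map $|x\rangle$ to $|x\rangle |v^T x\rangle$ for some vector $v \in \mathbb{Z}_2^n$. Based
on our knowledge of $S$, we could then predict whether the qubit $|v^T x\rangle$ should be $|0\rangle$, $|1\rangle$, or an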
equal mixture of |0) and |1) when measured. Or we could apply Hadamard gates to all n qubits 
of \S), then perform the same test for the subgroup dual to S. In saying that a system is in state 
\S), it is not clear if we mean anything more than that it responds to all such tests in expected 
ways. Similar remarks apply to Shor states and cluster states. 
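
The parity test just described has a simple classical prediction, which the following Python sketch computes by brute force. It assumes the subgroup $S$ is given by a list of generators over $\mathbb{Z}_2$ (an input format chosen for this example; the function names are likewise hypothetical) and reports whether the ancilla qubit holding $v^T x$ should read 0, 1, or an equal mixture.

```python
from itertools import product


def span(generators, n):
    """All elements of the subgroup of Z_2^n generated by the given vectors."""
    elems = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        x = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                x = [(a + b) % 2 for a, b in zip(x, g)]
        elems.add(tuple(x))
    return elems


def predict_parity_outcome(generators, v):
    """Predicted reading of the qubit |v^T x> over the subgroup state."""
    n = len(v)
    values = {
        sum(a * b for a, b in zip(v, x)) % 2 for x in span(generators, n)
    }
    if values == {0}:
        return "0"
    if values == {1}:
        return "1"  # cannot occur for a subgroup (it contains 0); kept for cosets
    return "mixed"
```

Because $x \mapsto v^T x$ is a homomorphism on a subgroup, the "mixed" case is always an equal mixture. The enumeration is exponential in the number of generators and is meant only for small checks.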

In our view, tests of the sort described above are certainly sufficient, so the interesting question 
is whether they are necessary, or whether weaker and more indirect tests would also suffice. This 
question rears its head when we ask whether non-tree states have already been observed. For as 
pointed out to us by Anthony Leggett, there exist systems studied in condensed-matter physics 
that are strong candidates for having superpolynomial tree size. An example is the magnetic salt
LiHo$_x$Y$_{1-x}$F$_4$ studied by Ghosh et al. [21], which, like the cluster states of Briegel and Raussendorf
[14], basically consists of a lattice of spins subject to pairwise nearest-neighbor Hamiltonians. The
main differences are that the salt lattice is 3-D instead of 2-D, is tetragonal instead of cubic, and
is irregular in that not every site is occupied by a spin. Also, there are weak interactions even
between spins that are not nearest neighbors. But none of these differences seems likely to change
a superpolynomial tree size into a polynomial one.

For us, the main issues are (1) how precisely we can characterize$^{11}$ the quantum state of the
magnetic salt, and (2) how strong the evidence is that that is the state. What Ghosh et al. [21] 
did was to calculate bulk properties of the salt, such as its magnetic susceptibility and specific heat, 
with and without taking into account the quantum entanglement generated by the nearest-neighbor 
Hamiltonians. They found that including entanglement yielded a better fit to the experimentally 
measured values. However, this is clearly a far cry from preparing a system in a state of one's 
choosing by applying a known pulse sequence, and then applying any of a vast catalog of tests to 
verify that the state was prepared. So it would be valuable to have more direct evidence that 
states qualitatively like cluster states can exist in Nature. 

11 By "characterize," we mean give an explicit formula for the amplitudes at a particular time t, in some standard 
basis. If a state is characterized as the ground state of a Hamiltonian, then we first need to solve for the amplitudes 
before we can prove tree size lower bounds using Raz's method. 
In summary, our results underscore the importance of current experimental work on large,
persistently entangled quantum states; but they also suggest a new motivation and perspective for 
this work. They suggest that we reexamine known condensed-matter systems with a new goal in 
mind: understanding the complexity of their associated quantum states. They also suggest that 
2-D cluster states and random subgroup states are interesting in a way that 1-D spin chains and 
Schrodinger cat states are not. Yet when experimenters try to prepare states of the former type, 
they often see it as merely a stepping stone towards demonstrating error-correction or another 
quantum computing benchmark. Thus, Knill et al. [33] prepared$^{12}$ the 5-qubit state

$$\begin{aligned} |\psi\rangle = \frac{1}{4}\big(\, & |00000\rangle + |10010\rangle + |01001\rangle + |10100\rangle + |01010\rangle - |11011\rangle - |00110\rangle - |11000\rangle \\ & - |11101\rangle - |00011\rangle - |11110\rangle - |01111\rangle - |10001\rangle - |01100\rangle - |10111\rangle + |00101\rangle \,\big), \end{aligned}$$

for which $\mathrm{MOTS}(|\psi\rangle) = 40$ from the decomposition

$$\begin{aligned} |\psi\rangle = \frac{1}{4}\big(\, & (|01\rangle + |10\rangle) \otimes (|010\rangle - |111\rangle) + (|01\rangle - |10\rangle) \otimes (|001\rangle - |100\rangle) \\ & - (|00\rangle + |11\rangle) \otimes (|011\rangle + |110\rangle) + (|00\rangle - |11\rangle) \otimes (|000\rangle + |101\rangle) \,\big), \end{aligned}$$

and for which we conjecture $\mathrm{TS}(|\psi\rangle) = 40$ as well. However, the sole motivation of the experiment
was to demonstrate a 5-qubit quantum error-correcting code. In our opinion, whether states with
large tree size can be prepared is a fundamental question in its own right. Were that question
studied directly, perhaps we could address it for larger numbers of qubits.

12 Admittedly, what they really prepared is the 'pseudo-pure' state $\rho = \varepsilon |\psi\rangle\langle\psi| + (1 - \varepsilon) I$, where $I$ is the maximally
mixed state and $\varepsilon \approx 10^{-5}$. Braunstein et al. have shown that, if the number of qubits $n$ is less than about 14,
then such states cannot be entangled. That is, there exists a representation of $\rho$ as a mixture of pure states, each of
which is separable and therefore has tree size $O(n)$. This is a well-known limitation of the liquid NMR technology
used by Knill et al. Thus, a key challenge is to replicate the successes of liquid NMR using colder qubits.

Let us end by stressing that, in the perspective we are advocating, there is nothing sacrosanct
about tree size as opposed to other complexity measures. This paper concentrated on tree size
because it is the subject of our main results, and because it is better to be specific than vague. On
the other hand, Section^ Appendix 9, and Appendix 10 contain numerous results about orthogonal
tree size, manifestly orthogonal tree size, Vidal's $\chi$ complexity, and other measures. Readers
dissatisfied with all of these measures are urged to propose new ones, perhaps motivated directly
by experiments. We see nothing wrong with having multiple ways to quantify the complexity of
quantum states, and much wrong with having no ways.

8 Conclusion and Open Problems

A crucial step in quantum computing was to separate the question of whether quantum computers
can be built from the question of what one could do with them. This separation allowed computer
scientists to make great advances on the latter question, despite knowing nothing about the former.
We have argued, however, that the tools of computational complexity theory are relevant to both
questions. The claim that large-scale quantum computing is possible in principle is really a claim
that certain states can exist: that quantum mechanics will not break down if we try to prepare
those states. Furthermore, what distinguishes these states from states we have seen must be more
than precision in amplitudes, or the number of qubits maintained coherently. The distinguishing
property should instead be some sort of complexity. That is, Sure states should have succinct
representations of a type that Shor states do not.

We have tried to show that, by adopting this viewpoint, we make the debate about whether
quantum computing is possible less ideological and more scientific. By studying particular examples
of Sure/Shor separators, quantum computing skeptics would strengthen their case, for they would
then have a plausible research program aimed at identifying what, exactly, the barriers to quantum
computation are. We hope, however, that the 'complexity theory of quantum states' initiated in
this paper will be taken up by quantum computing proponents as well. This theory offers a new
perspective on the transition from classical to quantum computing, and a new connection between
quantum computing and the powerful circuit lower bound techniques of classical complexity theory.
We end with some open problems.

(1) Can Raz's technique be improved to show exponential tree size lower bounds? 

(2) Can we prove Conjecture 19, implying an $n^{\Omega(\log n)}$ tree size lower bound for Shor states?

(3) Let $|\varphi\rangle$ be a uniform superposition over all $n$-bit strings of Hamming weight $n/2$. It is easy
to show by divide-and-conquer that $\mathrm{TS}(|\varphi\rangle) = n^{O(\log n)}$. Is this upper bound tight? More
generally, can we show a superpolynomial tree size lower bound for any state with permutation
symmetry?

(4) Is Tree = OTree? That is, are there tree states that are not orthogonal tree states?

(5) Is the tensor-sum hierarchy of Section infinite? That is, do we have $\Sigma_k \ne \Sigma_{k+1}$ for all $k$?

(6) Is TreeBQP = BPP? That is, can a quantum computer that is always in a tree state be 
simulated classically? The key question seems to be whether the concept class of multilinear 
formulas is efficiently learnable. 

(7) Is there a practical method to compute the tree size of, say, 10-qubit states? Such a method 
would have great value in interpreting experimental results. 
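
Problem (7) asks for a way to compute tree size for small states. Finding the minimum is beyond a short example, but an upper bound coming from an explicit decomposition can at least be checked mechanically. The Python sketch below expands the four-term decomposition of the 5-qubit state from Section 7 and compares it with the 16-term expansion, omitting the common factor 1/4. The data layout and the leaf-counting convention (one leaf per single-qubit basis ket, so an $m$-qubit ket costs $m$ leaves) are assumptions made for this illustration.

```python
from collections import defaultdict

# Signs of the 16 basis states of the 5-qubit state (factor 1/4 omitted).
TARGET = {
    "00000": 1, "10010": 1, "01001": 1, "10100": 1,
    "01010": 1, "11011": -1, "00110": -1, "11000": -1,
    "11101": -1, "00011": -1, "11110": -1, "01111": -1,
    "10001": -1, "01100": -1, "10111": -1, "00101": 1,
}

# Each term: (overall sign, 2-qubit factor, 3-qubit factor).
TERMS = [
    (1, {"01": 1, "10": 1}, {"010": 1, "111": -1}),
    (1, {"01": 1, "10": -1}, {"001": 1, "100": -1}),
    (-1, {"00": 1, "11": 1}, {"011": 1, "110": 1}),
    (1, {"00": 1, "11": -1}, {"000": 1, "101": 1}),
]


def expand(terms):
    """Multiply out the tensor products and collect coefficients."""
    out = defaultdict(int)
    for sign, left, right in terms:
        for a, ca in left.items():
            for b, cb in right.items():
                out[a + b] += sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def leaf_count(terms):
    """Number of single-qubit leaves used by the decomposition."""
    return sum(
        len(ket)
        for _, left, right in terms
        for factor in (left, right)
        for ket in factor
    )
```

Here `expand(TERMS) == TARGET` confirms the decomposition term by term, and `leaf_count(TERMS)` reproduces the count of 40.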
9 Appendix: Relations Among Quantum State Classes



This appendix presents some results about the quantum state hierarchy introduced in Section |21.
Theorem 26 shows simple inclusions and separations, while Theorem 27 shows that separations
higher in the hierarchy would imply major complexity class separations (and vice versa).

Theorem 26 

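(i) Tree $\cup$ Vidal $\subseteq$ Circuit $\subseteq$ AmpP.

(ii) All states in Vidal have tree size $n^{O(\log n)}$.

(iii) $\Sigma_2 \subseteq$ Vidal but $\otimes_2 \not\subset$ Vidal.

(iv) $\otimes_2 \subsetneq$ MOTree.

(v) $\Sigma_1$, $\Sigma_2$, $\Sigma_3$, $\otimes_1$, $\otimes_2$, and $\otimes_3$ are all distinct. Also, $\otimes_3 \ne \Sigma_4 \cap \otimes_4$.
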
Proof. 

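(i) Tree $\subseteq$ Circuit since any multilinear tree is also a multilinear circuit. Circuit $\subseteq$ AmpP since the
circuit yields a polynomial-time algorithm for computing the amplitudes. For Vidal $\subseteq$ Circuit,
we use an idea of Vidal 0^1: given $|\psi_n\rangle \in$ Vidal, for all $j \in \{1, \ldots, n\}$ we can express $|\psi_n\rangle$ as

$$\sum_{i=1}^{\chi(|\psi_n\rangle)} \alpha_{ij}\, |\varphi_i^{[1 \ldots j]}\rangle \otimes |\varphi_i^{[j+1 \ldots n]}\rangle$$

where $\chi(|\psi_n\rangle)$ is polynomially bounded. Furthermore, Vidal showed that each $|\varphi_i^{[1 \ldots j]}\rangle$ can
be written as a linear combination of states of the form $|\varphi_i^{[1 \ldots j-1]}\rangle \otimes |0\rangle$ and $|\varphi_i^{[1 \ldots j-1]}\rangle \otimes |1\rangle$,
the point being that the set of $|\varphi_i^{[1 \ldots j-1]}\rangle$ states is the same, independently of $|\varphi_i^{[1 \ldots j]}\rangle$. This
immediately yields a polynomial-size multilinear circuit for $|\psi_n\rangle$.

(ii) Given $|\psi_n\rangle \in$ Vidal, we can decompose $|\psi_n\rangle$ as

$$\sum_{i=1}^{\chi(|\psi_n\rangle)} \alpha_i\, |\varphi_i^{[1 \ldots n/2]}\rangle \otimes |\varphi_i^{[n/2+1 \ldots n]}\rangle.$$

Then $\chi(|\varphi_i^{[1 \ldots n/2]}\rangle) \le \chi(|\psi_n\rangle)$ and $\chi(|\varphi_i^{[n/2+1 \ldots n]}\rangle) \le \chi(|\psi_n\rangle)$ for all $i$, so we can recursively
decompose these states in the same manner. It follows that $\mathrm{TS}(|\psi_n\rangle) \le 2\chi(|\psi\rangle)\, \mathrm{TS}(|\psi_{n/2}\rangle)$;
solving this recurrence relation yields $\mathrm{TS}(|\psi_n\rangle) \le (2\chi(|\psi\rangle))^{\log n} = n^{O(\log n)}$.

(iii) $\Sigma_2 \subseteq$ Vidal follows since a sum of $t$ separable states has $\chi \le t$, while $\otimes_2 \not\subset$ Vidal follows from
the example of $n/2$ Bell pairs: $2^{-n/4} (|00\rangle + |11\rangle)^{\otimes n/2}$.

(iv) $\otimes_2 \subseteq$ MOTree is obvious, while MOTree $\not\subset \otimes_2$ follows from the example of $|P_n^i\rangle$, an equal
superposition over all $n$-bit strings of parity $i$. The following recursive formulas imply that
$\mathrm{MOTS}(|P_n^i\rangle) \le 4\, \mathrm{MOTS}(|P_{n/2}^i\rangle) = O(n^2)$:

$$|P_n^0\rangle = \frac{1}{\sqrt{2}} \left( |P_{n/2}^0\rangle |P_{n/2}^0\rangle + |P_{n/2}^1\rangle |P_{n/2}^1\rangle \right),$$

$$|P_n^1\rangle = \frac{1}{\sqrt{2}} \left( |P_{n/2}^0\rangle |P_{n/2}^1\rangle + |P_{n/2}^1\rangle |P_{n/2}^0\rangle \right).$$

On the other hand, $|P_n^i\rangle \notin \otimes_2$ follows from $|P_n^i\rangle \notin \Sigma_1$ together with the fact that $|P_n^i\rangle$ has no
nontrivial tensor product decomposition.

(v) $\otimes_1 \not\subset \Sigma_1$ and $\Sigma_1 \not\subset \otimes_1$ are obvious. $\otimes_2 \not\subset \Sigma_2$ (and hence $\otimes_1 \ne \otimes_2$) follows from part (iii).
$\Sigma_2 \not\subset \otimes_2$ (and hence $\Sigma_1 \ne \Sigma_2$) follows from part (iv), together with the fact that $|P_n^i\rangle$ has a
$\Sigma_2$ formula based on the Fourier transform:

$$|P_n^i\rangle = \frac{1}{\sqrt{2}} \left( \left( \frac{|0\rangle + |1\rangle}{\sqrt{2}} \right)^{\otimes n} + (-1)^i \left( \frac{|0\rangle - |1\rangle}{\sqrt{2}} \right)^{\otimes n} \right).$$

$\Sigma_2 \ne \Sigma_3$ follows from $\otimes_2 \not\subset \Sigma_2$ and $\otimes_2 \subseteq \Sigma_3$. Also, $\Sigma_3 \not\subset \otimes_3$ follows from $\Sigma_2 \ne \Sigma_3$, together
with the fact that we can easily construct states in $\Sigma_3 \setminus \Sigma_2$ that have no nontrivial tensor
product decomposition, for example,

$$\frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + \left( \frac{|01\rangle + |10\rangle}{\sqrt{2}} \right)^{\otimes n/2} \right).$$

$\otimes_2 \ne \otimes_3$ follows from $\Sigma_2 \not\subset \otimes_2$ and $\Sigma_2 \subseteq \otimes_3$. Finally, $\otimes_3 \ne \Sigma_4 \cap \otimes_4$ follows from $\Sigma_3 \not\subset \otimes_3$
and $\Sigma_3 \subseteq \Sigma_4 \cap \otimes_4$.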
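
The recursive formula for parity states in part (iv) is easy to confirm numerically. The Python sketch below builds $|P_n^i\rangle$ directly and again from two half-length parity states, under the assumption that the two halves are combined with coefficient $1/\sqrt{2}$; it also unrolls the recurrence $\mathrm{MOTS}(n) \le 4\,\mathrm{MOTS}(n/2)$ with an assumed base case of one leaf for a single qubit. All function names are illustrative.

```python
from itertools import product
from math import sqrt


def parity_state(n, i):
    """Amplitudes of |P_n^i>: equal superposition over n-bit strings of parity i."""
    amp = 2 ** (-(n - 1) / 2)
    return {x: amp for x in product((0, 1), repeat=n) if sum(x) % 2 == i}


def tensor(u, v):
    """Tensor product of two states stored as {bit tuple: amplitude}."""
    return {x + y: a * b for x, a in u.items() for y, b in v.items()}


def recursive_parity_state(n, i):
    """|P_n^i> assembled from the two parity classes of each half.
    The two pieces have disjoint supports, so plain assignment suffices."""
    h = n // 2
    out = {}
    for j in (0, 1):
        piece = tensor(parity_state(h, j), parity_state(n - h, i ^ j))
        for x, a in piece.items():
            out[x] = a / sqrt(2)
    return out


def mots_bound(n):
    """Leaf count from M(n) = 4 M(n/2), M(1) = 1, for n a power of two."""
    return 1 if n == 1 else 4 * mots_bound(n // 2)
```

For $n$ a power of two, `mots_bound(n)` equals $n^2$, in line with the $O(n^2)$ bound.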



Theorem 27 

(i) BQP = P$^{\#\mathsf{P}}$ implies AmpP $\subseteq \Psi$P.

(ii) AmpP $\subseteq \Psi$P implies NP $\subseteq$ BQP/poly.

(iii) P = P$^{\#\mathsf{P}}$ implies $\Psi$P $\subseteq$ AmpP.

(iv) $\Psi$P $\subseteq$ AmpP implies BQP $\subseteq$ P/poly.
Proof. 

(i) First, BQP = P$^{\#\mathsf{P}}$ implies BQP/poly = P$^{\#\mathsf{P}}$/poly, since given a P$^{\#\mathsf{P}}$/poly machine $M$, the
language consisting of all $(x, a)$ such that $M$ accepts on input $x$ and advice $a$ is clearly in
BQP. So assume BQP/poly = P$^{\#\mathsf{P}}$/poly, and consider a state $|\psi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle$ with
$|\psi\rangle \in$ AmpP. By the result of Bernstein and Vazirani jHj that BQP $\subseteq$ P$^{\#\mathsf{P}}$, for all $b$ there exists
a quantum circuit of size polynomial in $n$ and $b$ that approximates $p_0 = \sum_{y \in \{0,1\}^{n-1}} |\alpha_{0y}|^2$,
or the probability that the first qubit is measured to be 0, to $b$ bits of precision. So by
uncomputing garbage, we can prepare a state close to $\sqrt{p_0}\, |0\rangle + \sqrt{1 - p_0}\, |1\rangle$. Similarly, given
a superposition over length-$k$ prefixes of $x$, we can prepare a superposition over length-$(k+1)$
prefixes of $x$ by approximating the conditional measurement probabilities. We thus obtain a
state close to $\sum_x |\alpha_x|\, |x\rangle$. The last step is to approximate the phase of each $|x\rangle$, apply that
phase, and uncompute to obtain a state close to $\sum_x \alpha_x |x\rangle$.

(ii) Given a SAT instance, first use Valiant-Vazirani jl^j to produce a formula $\varphi$ with either
0 or 1 satisfying assignments. Then let $\alpha_x = 1$ if $x$ is a satisfying assignment for $\varphi$ and
$\alpha_x = 0$ otherwise; clearly $|\psi\rangle = \sum_x \alpha_x |x\rangle$ is in AmpP. By the assumption AmpP $\subseteq \Psi$P,
there exists a polynomial-size quantum circuit that approximates $|\psi\rangle$, and thereby finds the
unique satisfying assignment for $\varphi$ if it exists.

(iii) As in part (i), P = P$^{\#\mathsf{P}}$ implies P/poly = P$^{\#\mathsf{P}}$/poly. The containment $\Psi$P $\subseteq$ AmpP follows
since we can approximate amplitudes to polynomially many bits of precision in #P.

(iv) As is well known [H], any quantum computation can be made 'clean' in the sense that it
accepts if and only if a particular basis state (say $|0\rangle^{\otimes n}$) is measured. The implication
follows easily.
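
The prefix-by-prefix preparation in part (i) rests on a chain rule: the magnitude $|\alpha_x|$ is the product, over successive qubits, of the square roots of conditional measurement probabilities. The Python sketch below checks this classically by brute force. In the proof these probabilities would be approximated by a quantum circuit under the stated complexity assumption; here they are simply summed, which takes exponential time. The function names and the callable `amp` are assumptions of this illustration.

```python
from itertools import product
from math import sqrt


def prefix_probability(amp, n, prefix):
    """Probability that the first len(prefix) qubits read `prefix`."""
    rest = n - len(prefix)
    return sum(
        abs(amp(prefix + tail)) ** 2 for tail in product((0, 1), repeat=rest)
    )


def magnitudes_by_prefix(amp, n):
    """Rebuild |alpha_x| for every x, one qubit at a time, as a product of
    square roots of conditional probabilities."""
    result = {}
    for x in product((0, 1), repeat=n):
        mag = 1.0
        for k in range(n):
            p_prev = prefix_probability(amp, n, x[:k])
            if p_prev == 0:
                mag = 0.0  # this prefix is never observed
                break
            p_next = prefix_probability(amp, n, x[:k + 1])
            mag *= sqrt(p_next / p_prev)
        result[x] = mag
    return result
```

The phases are lost at this stage, which is why the proof applies them in a separate final step.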



10 Appendix: Manifestly Orthogonal Tree Size 

This appendix studies the manifestly orthogonal tree size of coset states:$^{13}$ states having the form

$$|C\rangle = \frac{1}{\sqrt{|C|}} \sum_{x \in C} |x\rangle$$

where $C = \{x \mid Ax \equiv b\}$ is a coset in $\mathbb{Z}_2^n$. In particular, we present a tight characterization
of $\mathrm{MOTS}(|C\rangle)$, which enables us to prove exponential lower bounds on it, in contrast to the
$n^{\Omega(\log n)}$ lower bounds for ordinary tree size. This characterization also yields a separation between
orthogonal and manifestly orthogonal tree size; and an algorithm for computing MOTS (|C)) whose 
complexity is only singly exponential in n. Our proof technique is independent of Raz's, and is 
highly tailored to take advantage of manifest orthogonality. However, even if our technique finds 
no broader application, the fact that it gives tight bounds makes it almost unique — and thus, we 
hope, of interest to complexity theorists. 

13 All results apply equally well to the subgroup states of Section 5.1; the greater generality of coset states is just
for convenience.
Given a state $|\psi\rangle$, recall that the manifestly orthogonal tree size $\mathrm{MOTS}(|\psi\rangle)$ is the minimum
size of a tree representing $|\psi\rangle$, in which all additions are of two states $|\psi_1\rangle$, $|\psi_2\rangle$ with "disjoint
supports", that is, either $\langle \psi_1 | x \rangle = 0$ or $\langle \psi_2 | x \rangle = 0$ for every basis state $|x\rangle$. Here the size $|T|$ of $T$
is the number of leaf vertices. We can assume without loss of generality that every $+$ or $\otimes$ vertex
has at least one child, and that every child of a $+$ vertex is a $\otimes$ vertex and vice versa. Also, given
a set $S \subseteq \{0,1\}^n$, let

$$|S\rangle = \frac{1}{\sqrt{|S|}} \sum_{x \in S} |x\rangle$$

be a uniform superposition over the elements of $S$, and let $M(S)$ be a shorthand for $\mathrm{MOTS}(|S\rangle)$.

Let $C = \{x : Ax = b\}$ be a coset in $\mathbb{Z}_2^n$, for some $A \in \mathbb{Z}_2^{k \times n}$ and $b \in \mathbb{Z}_2^k$. Let $[n] = \{1, \ldots, n\}$,
and let $(I, J)$ be a nontrivial partition of $[n]$ (one where $I$ and $J$ are both nonempty). Then clearly
there exist distinct cosets $C_I^{(1)}, \ldots, C_I^{(H)}$ in the $I$ subsystem, and distinct cosets $C_J^{(1)}, \ldots, C_J^{(H)}$ in
the $J$ subsystem, such that

$$C = \bigcup_{h \in [H]} C_I^{(h)} \otimes C_J^{(h)}.$$

The $C_I^{(h)}$'s and $C_J^{(h)}$'s are unique up to ordering. Furthermore, the quantities
$|C_I^{(h)}|$, $|C_J^{(h)}|$, $M(C_I^{(h)})$, and $M(C_J^{(h)})$ remain unchanged as we range over $h \in [H]$. For this reason we
suppress the dependence on $h$ when mentioning them.
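The decomposition of a coset across a partition $(I, J)$ can be seen concretely on a toy instance. The following is a minimal brute-force sketch, not part of the original argument: the function names `coset` and `split_coset`, the 0-indexed coordinates, and the list-of-rows encoding of $A$ are all assumptions made here for illustration.

```python
from itertools import product

def coset(A, b):
    """All x in {0,1}^n with A x = b over Z_2 (brute force, small n only)."""
    n = len(A[0])
    return [x for x in product((0, 1), repeat=n)
            if all(sum(a * xi for a, xi in zip(row, x)) % 2 == bi
                   for row, bi in zip(A, b))]

def split_coset(C, I, J):
    """Group C into blocks (C_I^(h), C_J^(h)) for 0-indexed coordinate sets I, J."""
    # For each I-part, collect the J-parts that complete it to an element of C.
    by_left = {}
    for x in C:
        left = tuple(x[i] for i in I)
        right = tuple(x[j] for j in J)
        by_left.setdefault(left, set()).add(right)
    # I-parts sharing the same set of completions belong to the same block.
    blocks = {}
    for left, rights in by_left.items():
        blocks.setdefault(frozenset(rights), set()).add(left)
    return [(sorted(lefts), sorted(rights)) for rights, lefts in blocks.items()]

# Toy example: even-parity strings on 3 bits, split as I = {0}, J = {1, 2}.
A, b = [[1, 1, 1]], [0]
C = coset(A, b)
blocks = split_coset(C, I=[0], J=[1, 2])
for c_i, c_j in blocks:
    print(c_i, c_j)
```

In this example the coset splits into $H = 2$ blocks, each with $|C_I^{(h)}| = 1$ and $|C_J^{(h)}| = 2$, illustrating that the block sizes do not depend on $h$.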

For various sets $S$, our strategy will be to analyze $M(S)/|S|$, the ratio of tree size to cardinality.
We can think of this ratio as the "price per pound" of $S$: the number of vertices that we have to
pay per basis state that we cover. The following lemma says that, under that cost measure, a coset
is "as good a deal" as any of its subsets:

Lemma 28 For all cosets $C$,

$$\frac{M(C)}{|C|} = \min_S \frac{M(S)}{|S|},$$

where the minimum is over nonempty $S \subseteq C$.

Proof. By induction on $n$. The base case $n = 1$ is obvious, so assume the lemma true for
$n - 1$. Choose $S^* \subseteq C$ to minimize $M(S^*)/|S^*|$. Let $T$ be a manifestly orthogonal tree for $|S^*\rangle$
of minimum size, and let $v$ be the root of $T$. We can assume without loss of generality that $v$ is
a $\otimes$ vertex, since otherwise $v$ has some $\otimes$ child representing a set $R \subset S^*$ such that $M(R)/|R| \le
M(S^*)/|S^*|$. Therefore for some nontrivial partition $(I, J)$ of $[n]$, and some $S_I^* \subseteq \{0,1\}^{|I|}$ and
$S_J^* \subseteq \{0,1\}^{|J|}$, we have

$$|S^*\rangle = |S_I^*\rangle \otimes |S_J^*\rangle,$$

$$|S^*| = |S_I^*|\,|S_J^*|,$$

$$M(S^*) = M(S_I^*) + M(S_J^*),$$

where the last equality holds because if $M(S^*) < M(S_I^*) + M(S_J^*)$, then $T$ was not a minimal tree
for $|S^*\rangle$. Then

$$\frac{M(S^*)}{|S^*|} = \frac{M(S_I^*) + M(S_J^*)}{|S_I^*|\,|S_J^*|} = \min \frac{M(S_I) + M(S_J)}{|S_I|\,|S_J|},$$

where the minimum is over nonempty $S_I \subseteq \{0,1\}^{|I|}$ and $S_J \subseteq \{0,1\}^{|J|}$ such that $S_I \otimes S_J \subseteq C$.
Now there must be an $h$ such that $S_I^* \subseteq C_I^{(h)}$ and $S_J^* \subseteq C_J^{(h)}$, since otherwise some $x \notin C$ would be
assigned nonzero amplitude. By the induction hypothesis,

$$\frac{M(C_I)}{|C_I|} = \min \frac{M(S_I)}{|S_I|}, \qquad \frac{M(C_J)}{|C_J|} = \min \frac{M(S_J)}{|S_J|},$$

where the minima are over nonempty $S_I \subseteq C_I^{(h)}$ and $S_J \subseteq C_J^{(h)}$ respectively. Define $\beta =
|S_I| \cdot |S_J| / M(S_J)$ and $\gamma = |S_J| \cdot |S_I| / M(S_I)$. Then since setting $S_I := C_I^{(h)}$ and $S_J := C_J^{(h)}$
maximizes the four quantities $|S_I|$, $|S_J|$, $|S_I|/M(S_I)$, and $|S_J|/M(S_J)$ simultaneously, this choice
also maximizes $\beta$ and $\gamma$ simultaneously. Therefore it also maximizes half their harmonic mean,

$$\frac{\beta\gamma}{\beta + \gamma} = \frac{|S_I|\,|S_J|}{M(S_I) + M(S_J)} = \frac{|S|}{M(S)}.$$

We have proved that setting $S := C_I^{(h)} \otimes C_J^{(h)}$ maximizes $|S|/M(S)$, or equivalently minimizes
$M(S)/|S|$. The one remaining observation is that taking the disjoint sum of $C_I^{(h)} \otimes C_J^{(h)}$ over all
$h \in [H]$ leaves the ratio $M(S)/|S|$ unchanged. So setting $S := C$ also minimizes $M(S)/|S|$, and
we are done. ■
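The algebra behind the harmonic-mean step in the proof above can be spelled out as follows. This is only a worked restatement under the definitions $\beta = |S_I|\,|S_J|/M(S_J)$ and $\gamma = |S_I|\,|S_J|/M(S_I)$; the shorthand $P$ is introduced here and does not appear in the original.

```latex
\begin{align*}
P &:= |S_I|\,|S_J|, \qquad
\beta = \frac{P}{M(S_J)}, \qquad
\gamma = \frac{P}{M(S_I)},\\[4pt]
\frac{\beta\gamma}{\beta+\gamma}
  &= \frac{1}{\dfrac{1}{\beta}+\dfrac{1}{\gamma}}
   = \frac{1}{\dfrac{M(S_J)}{P}+\dfrac{M(S_I)}{P}}
   = \frac{P}{M(S_I)+M(S_J)} .
\end{align*}
```

Since $1/(1/\beta + 1/\gamma)$ is increasing in both $\beta$ and $\gamma$, any choice that maximizes $\beta$ and $\gamma$ simultaneously also maximizes this quantity.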

We are now ready to give a recursive characterization of $M(C)$.

Theorem 29 If $n \ge 2$, then

$$M(C) = |C| \min \left( \frac{M(C_I) + M(C_J)}{|C_I|\,|C_J|} \right),$$

where the minimum is over nontrivial partitions $(I, J)$ of $[n]$.

Proof. The upper bound is obvious; we prove the lower bound. Let $T$ be a manifestly
orthogonal tree for $|C\rangle$ of minimum size, and let $v^{(1)}, \ldots, v^{(L)}$ be the topmost $\otimes$ vertices in $T$.
Then there exists a partition $(S^{(1)}, \ldots, S^{(L)})$ of $C$ such that the subtree rooted at $v^{(i)}$ represents
$|S^{(i)}\rangle$. We have

$$|T| = M(S^{(1)}) + \cdots + M(S^{(L)}) = |S^{(1)}| \frac{M(S^{(1)})}{|S^{(1)}|} + \cdots + |S^{(L)}| \frac{M(S^{(L)})}{|S^{(L)}|}.$$

Now let $\eta = \min_i \left( M(S^{(i)}) / |S^{(i)}| \right)$. We will construct a partition $(R^{(1)}, \ldots, R^{(H)})$ of $C$ such that
$M(R^{(h)}) / |R^{(h)}| = \eta$ for all $h \in [H]$, which will imply a new tree $T'$ with $|T'| \le |T|$. Choose
$j \in [L]$ such that $M(S^{(j)}) / |S^{(j)}| = \eta$, and suppose vertex $v^{(j)}$ of $T$ expresses $|S^{(j)}\rangle$ as $|S_I\rangle \otimes |S_J\rangle$
for some nontrivial partition $(I, J)$. Then

$$\eta = \frac{M(S^{(j)})}{|S^{(j)}|} = \frac{M(S_I) + M(S_J)}{|S_I|\,|S_J|},$$

where $M(S^{(j)}) = M(S_I) + M(S_J)$ follows from the minimality of $T$. As in Lemma 28, there must
be an $h$ such that $S_I \subseteq C_I^{(h)}$ and $S_J \subseteq C_J^{(h)}$. But Lemma 28 then implies that $M(C_I)/|C_I| \le
M(S_I)/|S_I|$ and that $M(C_J)/|C_J| \le M(S_J)/|S_J|$. Combining these bounds with $|C_I| \ge |S_I|$
and $|C_J| \ge |S_J|$, we obtain by a harmonic mean inequality that

$$\frac{M(C_I \otimes C_J)}{|C_I \otimes C_J|} \le \frac{M(C_I) + M(C_J)}{|C_I|\,|C_J|} \le \frac{M(S_I) + M(S_J)}{|S_I|\,|S_J|} = \eta.$$

So setting $R^{(h)} := C_I^{(h)} \otimes C_J^{(h)}$ for all $h \in [H]$ yields a new tree $T'$ no larger than $T$. Hence by the
minimality of $T$,

$$M(C) = |T| = |T'| = H \cdot M(C_I \otimes C_J) = \frac{|C|}{|C_I|\,|C_J|} \cdot \left( M(C_I) + M(C_J) \right).$$

■

We can express Theorem 29 directly in terms of the matrix $A$ as follows. Let $M(A) = M(C) =
\mathrm{MOTS}(|C\rangle)$ where $C = \{x : Ax = b\}$ (the vector $b$ is irrelevant, so long as $Ax = b$ is solvable).
Then

$$M(A) = \min \left( 2^{\mathrm{rank}(A_I) + \mathrm{rank}(A_J) - \mathrm{rank}(A)} \left( M(A_I) + M(A_J) \right) \right) \qquad (*)$$

where the minimum is over all nontrivial partitions $(A_I, A_J)$ of the columns of $A$. As a base case,
if $A$ has only one column, then $M(A) = 2$ if $A = 0$ and $M(A) = 1$ otherwise. This immediately
implies the following.

Corollary 30 There exists a deterministic $O(n 3^n)$-time algorithm that computes $M(A)$, given $A$
as input.

Proof. First compute $\mathrm{rank}(A^*)$ for all $2^n - 1$ matrices $A^*$ that are formed by choosing a nonempty subset
of the columns of $A$. This takes time $O(n^3 2^n)$. Then compute $M(A^*)$ for all $A^*$ with one column,
then for all $A^*$ with two columns, and so on, applying the formula (*) recursively. This takes time

$$\sum_{t=1}^{n} \binom{n}{t} t\, 2^t = O(n 3^n).$$
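The dynamic program described in this proof can be sketched as follows. This is an illustrative implementation under assumptions made here, not the authors' code: each column of $A$ is encoded as an integer whose bits are that column's $k$ entries, column subsets are bitmasks, and the names `gf2_rank` and `mots_from_matrix` are hypothetical.

```python
def gf2_rank(vectors):
    """Rank over Z_2 of vectors given as non-negative integers (bit i = entry i)."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]  # eliminate the leading bit and keep reducing
    return len(pivots)

def mots_from_matrix(columns):
    """Evaluate recursion (*) bottom-up over all nonempty column subsets."""
    n = len(columns)
    size = 1 << n
    rank = [0] * size
    for mask in range(1, size):
        rank[mask] = gf2_rank(columns[i] for i in range(n) if (mask >> i) & 1)
    m = [0] * size
    for mask in range(1, size):  # proper submasks are numerically smaller
        if mask & (mask - 1) == 0:  # base case: exactly one column
            col = columns[mask.bit_length() - 1]
            m[mask] = 2 if col == 0 else 1
            continue
        best = None
        sub = (mask - 1) & mask
        while sub:  # enumerate every nonempty proper subset of the columns
            rest = mask ^ sub
            factor = 1 << (rank[sub] + rank[rest] - rank[mask])
            cost = factor * (m[sub] + m[rest])
            if best is None or cost < best:
                best = cost
            sub = (sub - 1) & mask
        m[mask] = best
    return m[size - 1]

# A = [1 1 1] (one row, three columns): the even-parity coset on 3 bits.
print(mots_from_matrix([1, 1, 1]))
```

Enumerating submasks of every mask visits $3^n$ (subset, sub-subset) pairs in total, which matches the $3^n$ factor in the running time; this sketch recomputes each unordered partition twice for simplicity.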



Another easy consequence of Theorem 29 is that the language $\{A : M(A) \le s\}$ is in NP. We
do not know whether this language is NP-complete but suspect it is.

As we mentioned, our characterization lets us prove exponential lower bounds on the manifestly 
orthogonal tree size of coset states. 

Theorem 31 Suppose the entries of $A \in \mathbb{Z}_2^{k \times n}$ are drawn uniformly and independently at random,
where $k \in \left[ 4 \log_2 n, \tfrac{1}{2}\sqrt{n \ln 2} \right]$. Then $M(A) = (n/k^2)^{\Omega(k)}$ with probability $\Omega(1)$ over $A$.



Proof. Let us upper-bound the probability that certain "bad events" occur when $A$ is drawn.
The first bad event is that $A$ contains an all-zero column. This occurs with probability at most
$2^{-k} n = o(1)$. The second bad event is that there exists a $k \times d$ submatrix of $A$ with $d \ge 12k$ that
has rank at most $2k/3$. This also occurs with probability $o(1)$. For we claim that, if $A^*$ is drawn
uniformly at random from $\mathbb{Z}_2^{k \times d}$, then

$$\Pr[\mathrm{rank}(A^*) \le r] \le \binom{d}{r} \left( \frac{2^r}{2^k} \right)^{d-r}.$$

To see this, imagine choosing the columns of $A^*$ one by one. For $\mathrm{rank}(A^*)$ to be at most $r$, there
must be at least $d - r$ columns that are linearly dependent on the previous columns. But each
column is dependent on the previous ones with probability at most $2^r / 2^k$. The claim then follows
from the union bound. So the probability that any $k \times d$ submatrix of $A$ has rank at most $r$ is at
most

$$\binom{n}{d} \binom{d}{r} \left( \frac{2^r}{2^k} \right)^{d-r}.$$

Set $r = 2k/3$ and $d = 12k$; then, bounding $\binom{n}{d} \le n^d$ and $\binom{d}{r} \le d^r$, the above is at most

$$2^{\,12k \log_2 n + \frac{2k}{3} \log_2 (12k) - \left( 12k - \frac{2k}{3} \right) \frac{k}{3}} = o(1),$$

where we have used the fact that $k \ge 4 \log_2 n$.

Assume that neither bad event occurs, and let $(A_I^{(0)}, A_J^{(0)})$ be a partition of the columns of $A$
that minimizes the expression (*). Let $A^{(1)} = A_I^{(0)}$ if $|A_I^{(0)}| \ge |A_J^{(0)}|$ and $A^{(1)} = A_J^{(0)}$ otherwise,
where $|A_I^{(0)}|$ and $|A_J^{(0)}|$ are the numbers of columns in $A_I^{(0)}$ and $A_J^{(0)}$ respectively (so that
$|A_I^{(0)}| + |A_J^{(0)}| = n$). Likewise, let $(A_I^{(1)}, A_J^{(1)})$ be an optimal partition of the columns of $A^{(1)}$, and let
$A^{(2)} = A_I^{(1)}$ if $|A_I^{(1)}| \ge |A_J^{(1)}|$ and $A^{(2)} = A_J^{(1)}$ otherwise. Continue in this way until an $A^{(\ell)}$ is
reached such that $|A^{(\ell)}| = 1$. Then an immediate consequence of (*) is that $M(A) \ge Z^{(0)} \cdots Z^{(\ell-1)}$
where

$$Z^{(i)} = 2^{\mathrm{rank}(A_I^{(i)}) + \mathrm{rank}(A_J^{(i)}) - \mathrm{rank}(A^{(i)})}$$

and $A^{(0)} = A$.

Call $i$ a "balanced cut" if $\min \{ |A_I^{(i)}|, |A_J^{(i)}| \} \ge 12k$, and an "unbalanced cut" otherwise. If
$i$ is a balanced cut, then $\mathrm{rank}(A_I^{(i)}) \ge 2k/3$ and $\mathrm{rank}(A_J^{(i)}) \ge 2k/3$, so $Z^{(i)} \ge 2^{k/3}$. If $i$ is an
unbalanced cut, then call $i$ a "freebie" if $\mathrm{rank}(A_I^{(i)}) + \mathrm{rank}(A_J^{(i)}) = \mathrm{rank}(A^{(i)})$. There can be at
most $k$ freebies, since for each one, $\mathrm{rank}(A^{(i+1)}) < \mathrm{rank}(A^{(i)})$ by the assumption that all columns
of $A$ are nonzero. For the other unbalanced cuts, $Z^{(i)} \ge 2$.

Assume $|A^{(i+1)}| = |A^{(i)}|/2$ for each balanced cut and $|A^{(i+1)}| = |A^{(i)}| - 12k$ for each unbalanced
cut. Then if our goal is to minimize $Z^{(0)} \cdots Z^{(\ell-1)}$, clearly the best strategy is to perform balanced
cuts first, then unbalanced cuts until $|A^{(i)}| = 12k^2$, at which point we can use the $k$ freebies. Let
$B$ be the number of balanced cuts; then

$$Z^{(0)} \cdots Z^{(\ell-1)} = \left( 2^{k/3} \right)^{B} 2^{\left( n/2^B - 12k^2 \right)/12k}.$$

This is minimized by taking $B = \log_2 \left( \frac{n \ln 2}{4k^2} \right)$, in which case $Z^{(0)} \cdots Z^{(\ell-1)} = (n/k^2)^{\Omega(k)}$. ■
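The rank bound claimed in the proof of Theorem 31, $\Pr[\mathrm{rank}(A^*) \le r] \le \binom{d}{r} (2^r/2^k)^{d-r}$, can be sanity-checked by exhaustive enumeration for very small $k$ and $d$. The sketch below is only such a check, with hypothetical helper names and columns encoded as $k$-bit integers; it says nothing about the asymptotic regime the proof actually needs.

```python
from itertools import product
from math import comb

def gf2_rank(vectors):
    """Rank over Z_2 of vectors given as non-negative integers."""
    pivots = {}  # leading-bit position -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def rank_tail_probabilities(k, d):
    """Exact Pr[rank(A*) <= r] for r = 0..min(k, d), A* uniform over Z_2^(k x d)."""
    counts = [0] * (min(k, d) + 1)
    for cols in product(range(1 << k), repeat=d):  # every k x d matrix, by columns
        counts[gf2_rank(cols)] += 1
    total = (1 << k) ** d
    probs, running = [], 0
    for c in counts:
        running += c
        probs.append(running / total)
    return probs

def claimed_bound(k, d, r):
    """Right-hand side of the bound: C(d, r) * (2^r / 2^k)^(d - r)."""
    return comb(d, r) * (2 ** r / 2 ** k) ** (d - r)

k, d = 3, 4
for r, p in enumerate(rank_tail_probabilities(k, d)):
    print(r, p, claimed_bound(k, d, r))
```

For $k = 3$, $d = 4$ the bound is tight at $r = 0$ (both sides equal $2^{-12}$) and loose for larger $r$.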

A final application of our characterization is to separate orthogonal from manifestly orthogonal 
tree size. 

Corollary 32 There exist states with polynomially-bounded orthogonal tree size, but manifestly
orthogonal tree size $n^{\Omega(\log n)}$. Thus $\mathsf{OTree} \ne \mathsf{MOTree}$.

Proof. Set $k = 4 \log_2 n$, and let $C = \{x : Ax = 0\}$ where $A$ is drawn uniformly at random from
$\mathbb{Z}_2^{k \times n}$. Then by Theorem 31,

$$\mathrm{MOTS}(|C\rangle) = (n/k^2)^{\Omega(k)} = n^{\Omega(\log n)}$$

with probability $\Omega(1)$ over $A$. On the other hand, if we view $|C\rangle$ in the Fourier basis (that is,
apply a Hadamard to every qubit), then the resulting state has only $2^k = n^4$ basis states with
nonzero amplitude, and hence has orthogonal tree size at most $n^5$. So by Proposition part (i),
$\mathrm{OTS}(|C\rangle) \le 2n^5$ as well. ■
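The Fourier-basis claim in this proof can be checked numerically on toy instances: applying a Hadamard to every qubit of $|C\rangle$ with $C = \{x : Ax = 0\}$ should leave at most $2^k$ basis states with nonzero amplitude. The sketch below assumes rows of $A$ are encoded as $n$-bit integers and uses an unnormalized integer Walsh-Hadamard transform; the function name `hadamard_support` is hypothetical.

```python
def hadamard_support(rows, n):
    """Basis states with nonzero amplitude after H^{(x)n} is applied to |C>,
    where C = {x : A x = 0 over Z_2} and each row of A is an n-bit integer."""
    # Unnormalized amplitudes of |C>: 1 on C, 0 elsewhere.
    amp = [1 if all(bin(r & x).count("1") % 2 == 0 for r in rows) else 0
           for x in range(1 << n)]
    # In-place fast Walsh-Hadamard transform (exact integer arithmetic).
    h = 1
    while h < len(amp):
        for start in range(0, len(amp), 2 * h):
            for i in range(start, start + h):
                a, b = amp[i], amp[i + h]
                amp[i], amp[i + h] = a + b, a - b
        h *= 2
    return [y for y, v in enumerate(amp) if v != 0]

# k = 1 row on n = 3 bits: C is the even-parity coset.
print(hadamard_support([0b111], 3))
# k = 2 rows on n = 3 bits: C = {000, 111}.
print(hadamard_support([0b011, 0b110], 3))
```

In both examples the support is exactly the set of $\mathbb{Z}_2$-linear combinations of the rows of $A$, which has size at most $2^k$.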

Indeed, the orthogonal tree states of Corollary 32 are superpositions over polynomially many
separable states, so we also obtain that $\Sigma_2 \not\subset \mathsf{MOTree}$.